Good morning, and welcome to our presentation this morning on OpenStack, definition to delivery. We're going to talk a little bit about how to start an operations team and some of the challenges and difficulties that may come from trying to build one. We're going to talk a little bit about how operations teams typically get started: you find yourself with a running production environment, and now you need to build the team. Then we'll talk about some of the things you need to keep in mind as you are building the team. Then we'll cover some challenges and opportunities to take advantage of, some insight that other people have gained as they built their teams, and some of the experiences that we've had, so that if you're trying to build your own operations team, maybe you can take advantage of some of that knowledge. Then we'll summarize and get to questions and answers.

So let me introduce myself. I'm Heather Maniscalco. I've been at Walmart now since November. For the last 15 years or so, my focus has been on technology modernization, enterprise processes, and application implementation. I don't really come from the low-level engineering world; I'm more at the service management level. For example, I've been a PeopleSoft architect and an Oracle solutions engineer.
So I've been a DBA and an engineer, just more at the enterprise application level. That's the space I've played in, and at Walmart I've taken on the role of working with our service management infrastructure technologies, that layer of stuff that sits between, you know, the phone call coming in from your help desk and actually going to your orchestration layer, or your top tier in your support operations office. So that's my focus right now.

As you might imagine, and I might be the only one, I really appreciated the Gartner keynote yesterday on bimodal IT. I'm sure I heard a lot of groans around me, but I really understood what she was saying, and I think it's very important. We'll get into a little of that later on: just how important it is to understand that both modes matter to your stack and your operations, and that sometimes you need to really codify and protect that mode-one layer. I'll get into that when I talk about ITSM and solutioning later.

I did want to take just a moment to discuss who we are. We are from Walmart, and I know that probably everyone in this room knows what Walmart is, and you've probably been inside a Walmart at least once. But what you may not know is just how large our technology initiative is right now. It's in the billions of dollars, trying to elevate and move forward swiftly into a really modern, digitized world. We have 2.2 million employees globally, so we're actually considered the largest employer on the planet, and the largest retailer with the largest revenue. We have 11,500 stores in 28 countries, and we had over 480 billion dollars in sales last year. Why am I saying that? It's not to brag about Walmart.
It's to tell you just how large an initiative this is, and that what we're going to talk about, we're not doing perfectly, because we are pretty massively scaled. So I'm going to turn it over to Scott, but I just wanted to give you an idea of how big this operation is and how complex the world is that we're dealing with at this layer.

Yeah, my name is Scott Adkins. I came up through the sysadmin career path, as most of you probably have as well. I ended up at Comcast and spent several years with the Comcast team building up the OpenStack team and being an integral part of building the operations team there. I now work at Walmart. I haven't been at Walmart for very long; none of us really have. So our story at Walmart so far is that we're helping with operations and basically taking OpenStack to another level from what we've talked about at past summits, beyond just building e-commerce.

Everything starts in the beginning with a POC; most companies start that way. We come from a background of bare metal. VMware became pretty integral as part of the virtualization of hardware, and then the cloud came along and VMware became very involved in the cloud. OpenStack came along, matured, and became a real force in cloud technology. A lot of companies have embraced OpenStack, but there are a lot more companies out there that have yet to take a look at OpenStack and give it a try. So what you end up doing is setting up a POC and proving that OpenStack does work for your company, and that there are real use cases the company can take advantage of. It's usually set up on leftover hardware or lab hardware, stuff that's not actually being used. And then you find out that it actually works. At that point you talk to management, you talk to sales, you do whatever is necessary to take it to the next level, and you get it approved. Somewhere along the way you involve other users at the company, other teams who are interested in cloud technology. They bring their apps onto the platform just to see if there are any issues with it, whether they can develop for the cloud platform, whether this is something that's going to be really good for the company. And guess what: now you have production.

Operations is usually an afterthought at that point, because the whole effort so far has been nothing more than trying to get it off the ground and putting some steam into it, to get it to the point where you do want to take it to the next level. So what happens after it's approved? Well, now you buy hardware. You take the OpenStack deployment you installed and package it up, you finalize the configurations, you look at things like Ansible, Chef, Puppet, whatever your favorite method of deploying configurations is. You really start to pull the whole thing together. The teams are usually pretty small at that point. It's an engineering team, maybe engineering and DevOps combined; operations is done as part of that effort, because of the "you build it, you run it" philosophy. But at this point you need to start looking at operations as something that must be dealt with seriously, and think about all the things you need to do to run a successful environment without it falling flat and becoming a real problem for the company. So what does an operations team do?
Well, first and foremost: keep the lights on. You've got to keep the environment running in a healthy way, and it also needs to be very performant. Operations teams do a whole bunch of other stuff, all the traditional things a typical sysadmin would do in an environment. You deploy new hardware; it's often operations that does that. You perform upgrades, you do maintenances, and you provide user support, not only for running their apps in the cloud but also just for getting into the cloud. Especially at a company that's just getting started with OpenStack, there's a lack of knowledge about what it takes to run an application in a cloudy way, at scale. Operations inevitably has to communicate with those users and say, you know, you've got to run your apps this way, don't use static addresses, all the kinds of things that are typical of legacy applications.

Operations also does a lot of monitoring and reporting. Capacity reporting is usually in there too. If you get off the ground and start getting some steam behind running the environment, naturally upper management is going to come back and say, I'd like to see some dashboards and reports. How's the capacity? Do we have room to grow? So that's a very important aspect of an operations team. Not to mention that things always break. Fixing things is almost always part of what sysadmins do, and the same is true for an OpenStack engineer. OpenStack tends to run better these days; with the older versions there were a lot more issues. In the end, the issues change over time: something gets fixed in one release, and you get a new problem in another release. In the OpenStack world, that's just the nature of the game. So how do you build a team up?
Well, one thing to keep in mind is that you can build it just about any way you feel suits your company's needs. It needs to support how your team structures work at that company; one size does not fit all. Often teams start off as a single unified team, with DevOps, engineering, and operations all rolled into one. You might split it out so operations is on one side and DevOps and engineering are on the other, and if the team grows to a certain size, you may actually have them as three separate teams.

When we were running an operations team previously, our team started off as one team. It wasn't large at the time; it was one single team. But as the OpenStack environment grew, our team had to grow to compensate, and the number of users coming onto the cloud and using it also continued to grow. We found it was difficult for a single team to manage building and mapping out the new features that were going to be in the next release while also running the cloud and supporting all the users. So we broke it into two pieces: operations and engineering. Roles can be intermingled. Obviously, if you split the teams apart, you're naturally going to split the roles between the teams, but that doesn't mean the roles have to be so stringently defined that this can only be done by this team and that can only be done by that other team. If you split a team into operations and engineering, expect a lot of commingling of responsibility, which is really healthy for an OpenStack team as a whole. And expect that it's going to change over time: the bigger the environment gets, the more the needs change, and your team structures, what they do, and how they do it will also change.

What about hiring? This is one of the difficult things in the OpenStack community: actually hiring OpenStack engineers. One thing to keep in mind is that you don't need to hire only experts. If you've got a couple of good people on your team who already know OpenStack, hire some juniors, hire some regular sysadmins who are pretty competent at what they do, and train them up on OpenStack. Build the grassroots from within. I'd also say that hiring at least a few people, depending on the size of your shop, who have an understanding of applications is worthwhile. Not only because that's what I did, but because having someone who understands how applications actually work and live, and how people use them, is a really useful thing to have on your operations team. You can have engineers and developers and even Linux system administrators, but at the end of the day you also need someone who can bridge that gap and understand how the applications are supposed to behave, fail, and recover, and the security needs of the application and all that. So I think that's very useful to have at your disposal.

In our experience, we found that hiring OpenStack engineers directly for operations was very difficult. One, there aren't really a lot of OpenStack engineers looking for jobs; that's the first challenge. The second challenge is that a lot of the OpenStack engineers who are looking for jobs are more interested in the engineering side, not necessarily operations. Quite frankly, a lot of people find operations a little on the boring side and would rather not go that route. And if you do find them, you may find that OpenStack engineers are expensive. It's hard to acquire them, and it's hard to keep them in this kind of environment. It is a very competitive market, as we've seen here at the summit; there's a lot of interest in hiring OpenStack engineers, and it just doesn't come cheaply. And the little bullet at the very bottom, I probably could have moved it up, but it's an important point: hiring and training are expensive.
Hiring is definitely expensive, so you need to invest your dollars wisely and choose how you want to spend that money. It would make sense if you could hire a couple of engineers who already know OpenStack, but it's a lot cheaper to hire people who don't know that much about OpenStack, or who have maybe just used OpenStack from a user standpoint, people who've built applications either on OpenStack or maybe AWS, and then train them up on what you need from the other side.

On the training aspect, there are a number of different ways you can approach it. I've already spoken about training people internally; you can set up mentor-mentee type relationships, with the more experienced engineers training up the younger engineers. But you can also leverage online training. I know Linux Academy offers OpenStack training, which is pretty good, and OpenStack is now offering the administrator certification, which is also really good. You can go the formal training route as well: bring in Mirantis or some other third-party vendor, or ship your people to them to be formally trained. That's probably the most expensive route to go. But the most successful way to train people up is to use a combination of all of them. Again, it comes down to money, so use your dollars wisely, figure out what works best for you, and go that route, but be open-minded about trying all the different ways.

So you find yourself having to do operations. What is it that you need to do right now in order to be successful?
Well, very obviously, and everybody already knows this, you need monitoring and metrics first and foremost. Without that you're pretty much running blind, and if you have an issue you won't know it's actually occurring until it's too late, or you'll spend far more time than you should trying to figure it out. You also need to immediately create some kind of tiered-level support with 24/7 coverage, which includes an on-call rotation: someone is always on call, and if there's an issue, they're the one who responds and tries to figure out what it is. You also need to write your documentation as clearly as possible, defining what the escalation process is: when something occurs, this is exactly what you do, these are the people you call, this is the ticketing system you open tickets in, everything necessary to define that process. I think these are the most important things you need just to get out of the gate with an operations team.

Looking a little further ahead, what are the longer-term planning goals an operations team needs to work on? Processes and procedures are probably the weakest part of any operations team; some companies do it better than others. Definitely look at what your company does in other areas of operations, because inevitably there are non-cloud operations at a company. Among the different things that need to be considered: event management, how do you handle things when they occur?
Incident management: handling things when they break, and how to fix them. Problem management: dealing with things that break repeatedly, or some serious issue; basically root cause analysis, and how to try not to let it happen again. Change management is a very important process, as you probably know. Most things that break are related to some kind of change, specifically changes that people weren't aware were occurring. Change management puts more rigorous controls over that process so that everybody is aware that changes are occurring: what is occurring, what is being affected, and how to roll back.

Maintenance procedures are another important thing for an operations team, and I think probably one of the weaker areas of operations. You need to document what your maintenance procedures are. If you're automating them, they tend to be more self-documenting, but anything and everything you typically do on the command line as part of maintenance needs to be very well documented. You need to have SOPs associated with it, and all of that feeds into change management as its point of record for what is going to happen.

Capacity planning is another very weak area of operations, and it's also pretty difficult to do. You need to get a handle on not only what is currently in use, but what is expected to be used down the road as part of growing a cloud. You need to be in tune with the users: if you've got big data users coming onto the cloud, you need to understand how much CPU they're going to use, how much storage they're going to use, and how that affects your ability not only to take them into your cloud today, but what it's going to mean six months to a year from now. Do you even have the capacity to accommodate them down the road, knowing that they're not the only tenant that's going to be coming onto the cloud with big needs?

Then there's your patching and upgrade strategy. Upgrades come naturally; you can't stay on a particular version of OpenStack forever, and with each release of OpenStack it has gotten easier. Now we can do upgrades in place with as little disruption as possible. Patching is another story, and I think every company fights patching in different ways, with different sets of policies: some are "if it's not broken, don't fix it," others may patch more aggressively. I'm personally of the view that you need to patch more often than not. I see a lot of companies that don't patch at all when it comes to OpenStack; they wait for the next release. But in this day and age, with zero-day vulnerabilities and important bug fixes that stabilize features that are broken in OpenStack, you need to build in the process to patch, and know that it's going to come, rather than waiting for the zero-day to occur, going "I've got to figure out how to do it now," and ending up disrupting all the applications in the environment.

It's also very important to really document patching. Notifications: you want everyone in your world to know that you're actually doing the patching. You need a sign-off strategy, so other folks know you're doing the patching and are aware of it, and you need release notes for your patching. I think that kind of documentation is incredibly helpful. You also really need to understand your shop's critical times of need. For example, at Walmart, as you can imagine, Black Friday is hyper-critical to us, and we touch no systems around that time at all. So have an understanding of that calendar and that schedule. The organization I came from before Walmart also had a very aggressive calendaring system for understanding when patching could occur and when it couldn't. And the final note on patching: understand whether you have, or want to have, a roll-up strategy. It depends on how much patching you're actually doing; doing a lot of one-off patching may be really cumbersome. If you can work with your company's schedule and calendar, and work out a method to roll up several patches into one release to your system, that can also be very useful and prevent a lot of headaches.

And automation. I can't underscore enough how important automation is from an operations standpoint. It touches all kinds of things, but one of the things I think doesn't get touched enough is automating repetitive tasks: something's broken and you restart a service, or you have to clean up some log files, or you need to pull some important information out of the logs. From an operations standpoint, what I see a lot of teams doing is being content with just fixing it by hand, being done with it, and moving on. Those are areas that can easily be automated, and not just automated, but also pushed back through the lifecycle of OpenStack to maybe find permanent fixes or permanent ways to deal with the issue, so that you don't even have the need to repeat or automate those repetitious tasks.

Self-healing is one of those things you could say is a kind of repeatable task. A service breaks, your alarm goes off, someone goes in, restarts the service, maybe cleans up a couple of things, and it's back running again. Self-healing is a technique where monitoring triggers some event to occur, some script to run, that might go restart the service on its own. I've seen some products out there that do it, Monit, StackStorm, things like that, that you can integrate to help you along the way. But again, I will underscore that just because you're self-healing doesn't mean you don't want to push those things back through the lifecycle of OpenStack, because if you're having to do that, there's still a fundamental problem with why it's breaking to begin with.

So, monitoring is kind of hard. You get some basic monitoring out of the box, and most companies already have some kind of monitoring solution in place for other things, whether it's Nagios or Sensu or something like that. Generally you'll try to integrate your OpenStack environment into whatever existing environment there is; sometimes you'll stand up your own. One of the challenges in OpenStack is actually monitoring everything that needs to be monitored, and knowing what it is that needs monitoring. With each version of OpenStack there's change: things are different from the previous version, stuff that worked before doesn't work now, and new features have been implemented that you haven't had time to create checks for. It's a real challenge to stay on top of it, so leverage the OpenStack community; I provided a wiki link for some of the community's thoughts on what to monitor, how to monitor, and with what. Don't rely on simple basic checks like service monitoring. Use functional checking: if the users are doing something and it breaks for them and you don't see it, it doesn't help. Sure, their services are up, but the functionality isn't there. So do your own functional checking. You can leverage Tempest, you can leverage Rally, and there are other tools out there.
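A home-grown functional check of the kind described here can be as simple as a harness that walks a scripted sequence of steps and fails fast on the first error. This is a minimal sketch; the step names are illustrative stand-ins, and the no-op callables would in practice drive the OpenStack APIs (for example via openstacksdk):

```python
import time

def run_functional_check(steps):
    """Run named check steps in order, stopping at the first failure.

    `steps` is a list of (name, callable) pairs; each callable should
    raise an exception on failure.  Returns (passed, results), where
    results maps each attempted step to its timing or error message.
    """
    results = {}
    for name, step in steps:
        start = time.monotonic()
        try:
            step()
        except Exception as exc:
            results[name] = f"FAILED: {exc}"
            return False, results
        results[name] = f"ok ({time.monotonic() - start:.2f}s)"
    return True, results

# Illustrative wiring only: a real check would boot a VM, wait for it
# to become ACTIVE, create and attach a volume, and tear it all down.
steps = [
    ("boot_vm", lambda: None),
    ("create_volume", lambda: None),
    ("attach_volume", lambda: None),
    ("teardown", lambda: None),
]
passed, results = run_functional_check(steps)
```

Wired into Nagios or Sensu as an active check, something like this exercises the same path a user would take, so it catches breakage that a simple service-up check misses.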
You can write your own home-grown scripts: write a script that launches a VM and makes sure it comes up all the way, creates a volume, mounts the volume, and then tears the whole thing down.

Also, logging. Logging is another kind of tough area; OpenStack needs a little bit of help there. I've heard that on a number of occasions here: OpenStack needs a little bit of help on logging. Centralized logging is key. Whether you're using Splunk or Logstash or some other method, aggregate your logs and build tools around those logs to correlate events, so that if something's broken you can detect it. Take advantage of the fact that a lot of information is there and can be leveraged for your monitoring.

Coming back to your relationships with the help desk or your on-call people: document the checks very, very thoroughly. If you've got a check in Nagios or in Sensu, document what it is, document what it's monitoring, document why it's needed, document what it means when it actually alarms, document how to verify that it's not a false alarm, so that when the on-call person is engaged they can go in and say, yeah, this is really a problem, and then document what you need to do to fix it. If all of those pieces are documented on each and every check in your monitoring system, it's going to enable on-call to do their work better. It's going to enable the new employees and new hires you just brought on to spin up faster, because they can read it; it's a kind of self-documentation, and you can get them into the on-call rotation faster. From a help desk or NOC standpoint, if you've got a NOC that wants to be a 1.5 NOC or a 2.0 NOC, that information is going to be useful to them as well, because they can actually use it to maybe do the fixes themselves, if they have that kind of access.

Alarms can also be very noisy. How many of you have received alarms in your mailbox, and there were so many of them that you just got lost in the noise? A lot of it may not be that important; you know, they self-clear. Fine-tune your monitoring system so that when an alarm is sent out, it's actually paid attention to, not ignored; it's important, and it needs to be dealt with and taken care of. Nobody wants to be inundated by a thousand alarms a day; most of it is just noise.

Talking about capacity a little bit: this is what I've found to be one of the hardest things to do from an operations standpoint. There are a number of different things you can consider for capacity, looking at it from the OpenStack standpoint. You've got flavors: you've got a flavor that says it's a 4-gig VM with maybe two virtual CPUs, fine. OpenStack allows you to launch your VMs up to a certain point, but once you reach a certain allocated maximum, it won't let you spin up any more VMs. So allocation in OpenStack is very important. Overcommit values scale how far you can take allocations. The default overcommit settings in OpenStack are a little on the aggressive side, and I highly suggest you tune them back. I know in older versions, and I'm not sure about the newest versions, it was something like 150% for RAM and 16 times overcommit for CPU. I think both of those are unrealistic.

Other things you need to track for capacity are not just allocations, but the quotas. When users are given capacity in a cloud, they tend not to use all of it right away; it may take them a while to actually use it all. It's kind of a ticking time bomb with regard to capacity. Allocation is what's in use right now as far as OpenStack is concerned; quota is what could be used in the future. With regard to actual use: if you've got a VM launched with four gigs of RAM and two virtual CPUs, and it's only using one gig, actual use may look really small. But when it comes to capacity planning, actual use is really kind of a meaningless value.
You're more concerned about whether users can spin up VMs and, if you let things go on for a while, whether you even have the space to spin up all the VMs that users are allowed to. Typically operations teams will close an environment at a certain threshold. So, you know, we'll say at 80% full we're not going to take on any more new tenant requests, we're not going to accept quota increases, we're just putting a stop to it. But the thing is, in OpenStack that's typically done from the allocation standpoint. Users who still have a lot of unused quota are going to continue to spin up more machines, and you may find that even if you closed an environment at 80%, you still run out of space. So do you plan your capacity based on quotas? Do you plan it on allocation? Do you do a mix? It's a really tough thing for operations to decide on.

And then, on top of that, not everything has quotas. Ephemeral disk is a good example: you launch a VM on ephemeral disk, and there are no quotas associated with it, so it's very easy for disk space to run out unless you build it into the way you do flavors at the company. Object storage is another tough one: there's no native Swift quota support in OpenStack; you have to do the quotas on the Swift side with additional ACLs and build the tooling around that to make it work. All of these things can make capacity planning more difficult.

One of the things I think is really valuable, and it's my personal preference, it doesn't mean it's anybody else's, is having a separate team do capacity planning and user support, to take that load off operations. It's a difficult thing for operations to do when they've got all kinds of other stuff, but when it comes to thinking about users coming onto the cloud, you really want a dedicated team that can communicate with the users: find out what the users are doing, what their needs are, what they've got coming down the pike. What are their expectations for the next six months, a year, two years? How much storage, CPU, and RAM are they going to need? Talk to them about the kinds of applications they're going to run, and how they can run them in a cloudy way so that they're successful in the cloud. And then, if they're the same team that actually manages capacity, rather than operations managing capacity, it can all be rolled into one: they know how much capacity there is, they can see the applications coming down the pike, they can determine whether you have enough capacity to meet everybody's needs, and they can engage infrastructure and management about purchasing more hardware and growing the environment before you run out of space. Give this team some teeth, too, to audit the stuff that's running in the cloud. Give them the teeth to enforce policy; give them the teeth to come back and say, you know, you're eating way too many resources, you asked for this much and you're using this much, why can't you scale back, we're going to take away some of your quota.
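The allocation-versus-quota tension described here is easy to put into numbers. Here is a small sketch with purely illustrative figures; it assumes no RAM overcommit (a ratio of 1.0), deliberately more conservative than the historical 1.5 default mentioned earlier:

```python
def effective_capacity(physical, overcommit_ratio):
    """Capacity the scheduler will hand out after overcommit."""
    return physical * overcommit_ratio

# Illustrative numbers only: 10 hypervisors with 256 GB RAM each.
physical_ram_gb = 10 * 256
schedulable_ram_gb = effective_capacity(physical_ram_gb, 1.0)

allocated_ram_gb = 2100   # RAM claimed by VMs that exist right now
quota_ram_gb = 3200       # sum of every tenant's RAM quota

allocation_pct = 100 * allocated_ram_gb / schedulable_ram_gb
quota_pct = 100 * quota_ram_gb / schedulable_ram_gb

# Closing the environment at "80% allocated" looks safe on paper...
closed_to_new_tenants = allocation_pct >= 80
# ...but unused quota alone can overrun the hardware: tenants may
# still launch this much RAM beyond what physically exists.
quota_overrun_gb = max(0, quota_ram_gb - schedulable_ram_gb)
```

With these numbers the environment is "closed" at about 82% allocated, yet outstanding quotas total 125% of schedulable RAM, leaving 640 GB of promises that cannot be kept. That is exactly the trap of closing an environment on allocation alone.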
Give them some ability to help manage capacity after the fact, not just before the fact.

So, I will say we only have a couple of minutes left, so I'm going to rip through these slides as fast as I can. To that end, if anyone is interested in what I'm going to cover in the next couple of minutes, please find me afterwards, or if anyone's really interested, I'm perfectly happy to do a little bit of a round table with a group of people as well. What I want to talk about next is fairly important, because as he was talking about future-state capacity planning, asset management, people, teams communicating, the question is: how are you going to do that? It's not just a matter of the nuts and bolts of what you have to do; how are you going to do it?

The first thing I would say is you really have to take your help desks very seriously. I think a lot of times folks in this role are, I don't want to say not thought of as being part of operations and integral, but you have to realize these are the folks interfacing with your users and your customers and your clients on a daily basis. They have a very hard job, and they're very prescriptive. So when they come to you and say, I need something thoroughly documented, they mean it: their bonuses, their merit pay, their promotions, a lot of that is intimately tied to how closely they adhere to their SLAs and how much client satisfaction they have.
So give them a lot of credit and work with them. You'd be amazed at the amount of knowledge they actually have about your organization, how it operates, and especially how it interfaces with its customers.

The next thing I want to talk about is process and workflow design. When you go through starting up an operations team, or if you already have an operations team, you really need to understand your processes and workflows. It's not just a matter of saying, oh, okay, I need to provision an environment, that's a workflow, done. You need to really understand why you've been doing something. If you've been in a traditional enterprise model, your workflows and processes have probably been done very differently, if you haven't had automation before, if you haven't had an ITSM solution. So you have to be able to break down those older processes and workflows: why were you doing it? What function does it serve? What's the fit-gap between the modern way you're now moving toward and what you used to do? I've put in here what I look for when I go through this process: identify the process, understand it thoroughly, plan for it, design, develop, and implement, and plan for a phased approach. You're not going to get a big bang; you're not going to be able to eliminate all the spreadsheets in one day. So you're going to have to really phase this in and give it a lot of thought.

You also want to identify the key components of your processes. Every process will need to have a metric; the success of the process has got to be measured. You need knowledge; it's always got to be documented.
You've got to know what to do when something goes wrong, who does what and why, and what the outcome is supposed to be. You really want to understand that part of your process. One of the key things, and this is the meat of what I want to talk about, is tool duplication and the way tools get implemented, which I think is a big problem in a lot of shops. What you'll find, especially if you've got an older shop and you're moving toward a whole new world, is that you have five thousand tools that all do the same thing, and you're going to need to cull that herd down to something reasonable. That doesn't mean every team has to look like every other team, but within teams and divisions especially, really take a look at your tool set, understand what you're using each tool for, and figure out whether something was just someone's pet project when there's really something much better out there. Do a tool audit. That's a key thing to do when you're setting up a new operations team, or just in general to run your IT shop. Again, do a fit-gap analysis: what does the tool do, and what doesn't it do? A telling way of finding what it doesn't do is counting how many spreadsheets, emails, and post-it notes are sitting on someone's desk. Plan for change: if you want to modernize and digitize, move to tools that are really going to serve your needs, and don't reinvent the wheel. Especially with open source, if there's a tool out there that you've been leveraging, you don't need to go build it, and you don't necessarily need to buy a new one either. Really understand what you have and don't reinvent those wheels. By doing this you're typically going to reduce your cost and increase your efficiency. So again: IT service management. What is that?
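One way to start the tool audit described above is to group an inventory by the function each tool serves and flag the overlaps; duplicated functions are the herd to cull. The tool names and categories here are made-up examples, not a recommendation.

```python
from collections import defaultdict

def find_duplicate_tools(inventory):
    """Group tools by the function they serve and return any function
    covered by more than one tool (candidates for culling)."""
    by_function = defaultdict(list)
    for tool, function in inventory:
        by_function[function].append(tool)
    return {fn: tools for fn, tools in by_function.items() if len(tools) > 1}

# Hypothetical inventory of (tool, function-it-serves) pairs.
inventory = [
    ("nagios", "monitoring"),
    ("zabbix", "monitoring"),
    ("jenkins", "ci"),
    ("ansible", "config-management"),
]
print(find_duplicate_tools(inventory))  # {'monitoring': ['nagios', 'zabbix']}
```

The hard part, of course, is the fit-gap judgment on each overlap, which a script can surface but not make.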
IT service management I consider to be both an infrastructure framework and a set of tools. It is typically mode one, to tie it back to our first keynote; it's really a system of record. If you're going to adopt an ITSM solution, think about what it should cover: system audits, company audits, data, metrics. You need to think about incident, problem, change, CMDB, knowledge, and onboarding. All of these are part of ITSM, and an IT service management solution will give you what he's been talking about. But give it a lot of consideration, and understand that it is not a trivial thing to implement. Honestly, the ratio of time spent on this section versus the others is very indicative of how it goes in most shops: it's a very complex thing to implement. So make sure you dedicate enough time, effort, and resources to it, and understand that even though it's the more classic mode, this is engineering; it is architecture. There's a lot to making these big applications work well, so don't trivialize the people working on that side of the house either. You also need to plan for the fact that implementing these solutions takes a lot of training. And stakeholder inclusion is key.
It might be the most important thing I can think of, because if you don't have stakeholders and people on board, you'll never be able to implement these solutions. Another big thing, near and dear to my heart, is integration. A lot of people don't understand that once you have an ITSM solution, plus maybe a Jira on the back end, or Remedy, or whatever the heck it is you have, or a separate knowledge management system, you need to integrate these things. And I'm not talking about bare message bus integration, low-level "send a message from here to there, get some data, done." I'm talking about real orchestration: a real transformative layer that can take your data from one system while remaining endpoint-agnostic, transform that data, add business logic, and enrich the data; that makes sure it gets to where it needs to go; and that then lets you actually leverage that data to make good business decisions. So I highly recommend you treat integration as another hypercritical piece of the puzzle, and don't assume it's something you can just shove in afterwards. And please, please, please do not adopt the mentality of doing point-to-point solutions for everything. It does not work and it does not scale; I know that from painful experience. Knowledge management is another key success factor; it's critical to running an IT operations shop. Then there are build-versus-buy decisions. I know that's something near and dear to a lot of people here, especially with everyone moving to the open source mentality, which is great.
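The endpoint-agnostic orchestration layer described above, as opposed to point-to-point wiring, can be sketched as a hub that transforms and enriches records before routing them. Every name in this sketch is a placeholder assumption. The scaling argument holds as well: fully point-to-point wiring of n systems needs on the order of n*(n-1)/2 links, while a hub needs one adapter per system.

```python
def orchestrate(record, transforms, routes):
    """Pass a record from any source system through a chain of
    transform/enrich steps, then hand it to every matching route.
    The layer never cares which endpoint produced the record."""
    for step in transforms:          # transform, add business logic, enrich
        record = step(record)
    delivered = []
    for matches, deliver in routes:  # endpoint-agnostic routing
        if matches(record):
            delivered.append(deliver(record))
    return record, delivered

# Illustrative steps: normalize field names, then enrich with priority logic.
normalize = lambda r: {k.lower(): v for k, v in r.items()}
add_priority = lambda r: {**r, "priority": "high" if r.get("impact") == "outage" else "low"}

routes = [
    (lambda r: r["priority"] == "high",
     lambda r: f"page on-call for {r['id']}"),
]

record, actions = orchestrate({"ID": "INC-1", "Impact": "outage"},
                              [normalize, add_priority], routes)
print(actions)  # ['page on-call for INC-1']
```

Adding a new system here means adding one adapter and, if needed, one route, rather than rewiring every existing connection.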
But building is also a big thing, especially in top-notch engineering shops where the instinct is always to build it. You really have to think about what your business does. If your business is not selling an ITSM solution or an integration solution to the world, I would highly recommend you consider not building one; really look at what the tools do and work with a partner who can help, because they're dedicating their whole world to R&D on those particular solutions. Also expect that you're going to have to be a champion. If you're the one putting in the solution, you're going to have to evangelize, from the ground up and from the top down, and you're going to send emails to everyone talking about the solution you're implementing. So, that was fast; I will definitely be available afterwards if anyone is interested. Next: key takeaways. We're just going to jump right to the end. Team structure: build your team the way you need it. Hiring: you don't need to hire all experts; you can train from within and use a mixed set of training techniques if necessary. Build a relationship with your NOC; leverage them and let them help you. But of course, as an operations team, your biggest job is keeping the lights on; that's the most important job. Challenges and opportunities: monitoring is hard, so leverage the OpenStack community and let them help guide you in setting up a proper monitoring solution. Capacity planning is also difficult: measure everything and collect metrics on everything. Again, build relationships with your NOC and your user community.
It makes things better in the end. Tooling: try to reduce the number of tools you have and cut down on duplicate tools, but recognize that you may have to keep some because of the way things work within the company. And finally: automate, automate, automate. That's something I think everybody can benefit from. Since we're out of time, we will be outside for a little bit afterwards, and we'd be happy to answer any questions. Thank you for coming.