Hello there. In the following slides we will talk about a series of problems that we encountered in a number of different companies, where the technology alone was not enough to solve the issues. I am Andrea and I work at Red Hat Switzerland as a consultant.

We always talk about cooperation, and cooperation is definitely an essential part of any successful process or project. But most of the time the word DevOps is abused: we throw technology at the problem without solving it at a deeper level. The psychological aspect is often underestimated. In companies that are very old school there is no cooperation whatsoever, there is a lot of gatekeeping, and there is either no means to automate or no wish to automate. There is a lot of fear of change.

It's actually an interesting playground when you try to do all these things, because a lot of it recurs in all of those companies. What you always see is a silo organization in which there is no sense of collaboration whatsoever. Changes impact other teams. Documentation is not there. There is unnecessary complexity because of an absolute lack of standards: the processes are very complicated, most of them are over-engineered on purpose, and there are not even coding standards. So there is no common ground across the teams. There is a lack of automation, and normally not only because there is no software to automate with, but because there is no will to automate; manual tasks still feel comfy for most people. And there are a lot of technology gaps between the teams. Usually these organizations are divided into separate teams, each dedicated to a specific area of technology.
Often there are silos: the teams don't talk to each other. They do their own things for their own department, without caring about other departments or how their changes will impact them. When they make a change and it breaks something for other people, this generates a reactive culture, where changes are addressed after they happen instead of before. The other teams have to adapt their processes, scripts and so on after the change has happened, when it's too late.

There is usually also a general lack of standards. For automation, standardization is crucial: you cannot automate without standardization, because it would be way too complex to write a program that handles all the variation. So we need standards and naming conventions, and they must reflect all the information needed. And if changes are made, they must be communicated in advance, so that people can adapt their processes and programs to them. Without enough time, going back to the previous slide, you get reactive instead of proactive actions.
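To illustrate why a naming convention that "reflects all the information needed" makes automation simpler, here is a minimal Python sketch. The convention itself (`<site>-<environment>-<role>-<index>`, for example `zrh-prod-web-01`) is a hypothetical one invented for this example, not one mentioned in the talk, and the names `HOSTNAME_PATTERN`, `HostInfo` and `parse_hostname` are likewise assumptions.

```python
import re
from typing import NamedTuple, Optional

# Hypothetical convention: <site>-<environment>-<role>-<two-digit index>,
# e.g. "zrh-prod-web-01". Any real convention would be agreed between teams.
HOSTNAME_PATTERN = re.compile(
    r"(?P<site>[a-z]{3})-(?P<env>dev|test|prod)-(?P<role>[a-z]+)-(?P<index>\d{2})"
)


class HostInfo(NamedTuple):
    site: str
    env: str
    role: str
    index: int


def parse_hostname(name: str) -> Optional[HostInfo]:
    """Return the information encoded in a hostname, or None if it breaks the convention."""
    match = HOSTNAME_PATTERN.fullmatch(name)
    if match is None:
        return None
    return HostInfo(match["site"], match["env"], match["role"], int(match["index"]))


if __name__ == "__main__":
    for host in ["zrh-prod-web-01", "server42"]:
        print(host, "->", parse_hostname(host))
```

With a convention like this, an automation tool can decide what to do with a machine from its name alone, and a host that does not follow the standard is detected immediately instead of being handled by special-case code.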
Usually there is also a lack of clear communication. We can define communication in terms of inputs and outputs: what do you expect from me, and what do I get from you. On the technical side we usually found that these inputs and outputs are either not well documented or outdated. Maybe something changed and the team responsible didn't update the documentation, so the other teams find out when things don't work as expected. It also happens with processes: sometimes you need to run after people to see who knows the right steps to do certain things. So this is also a big problem for automation.

There is also a lack of a feeling of ownership. Many departments think: okay, this is my job, and if I make a change in my area I don't care if it's going to break somebody else's script; it's not my problem. This is very bad because it leaves a lot of people frustrated: they see their processes break, sometimes that gets management attention, and then we enter the blame game of who is responsible for this, and so on. All of this feeds back into the silo problem: people become upset with each other, and then they don't want to work together.

One of the main issues that we saw in a lot of those organizations is the fear of change. Even though the engineers might have the will to change, when they actually need to change things and they have worked a lot with one specific technology, what we found is that they don't want to change it, because they're scared that the new technology could replace them. So even though they do have the technical background, they don't want to touch it, simply because they're scared of what could happen. And even though
business strategy is "automate everything at every cost", the problem we normally encounter is that stability is weak. The servers are being managed manually, there is no automation, and again there are no standards. Controls are at the bare minimum, so you have people messing around with things: they don't go through their LDAP account, they have a common root account or whatever, so you can't track what they do. Priorities are not set, again because of a lack of communication: people don't communicate with each other, they have a lot of things to do, and they just do one thing after the other. There is no organization and no common goal, so things are going to break. In this situation it's extremely hard to test, and testing is something you definitely want to do, because without it you don't know whether things are going to work.

The problem the first time they try to apply automation is that you have something that is not automated and messy, you automate the mess, and it becomes an even bigger mess. What happens is that some people are very good at it, so they start to automate, break things, and don't communicate. The others don't know the new technology, they don't get their hands on it, and they can't fix it. Plus there is fragmented ownership: that guy at that desk knows how to automate and knows the system, so he's going to do it. If he gets sick, nobody is going to touch it, so if it's broken, he will fix it when he's back. As I said, automating in this kind of messy environment just makes it messier.

This leads to a vicious circle in which the company needs something, engineering receives the request, and it starts a sort of process that is half manual, half automated. No one knows exactly who
is doing what. Then, when this process is done, the result needs to be checked, it's probably broken, and it needs to be redone. So management gets upset, and the vicious circle starts.

So how did we solve the challenges, on both the human side and the technical side? Collaboration and openness are nice, but unfortunately we humans get moody and unstable sometimes, depending on the person. So there should be some technical means to prevent human errors, and also to prevent accidental sabotage through lack of interest, or just because somebody doesn't like some other person, and so on.

What can we use to solve this technical challenge? We have Ansible Tower, which we use very often for managing playbooks. It's a good platform for engineers to work together. We can manage credentials so that other engineers can use them without knowing the credentials themselves: they can use them but they cannot see them. We can also use CloudForms for the UI. It's very easy to prepare service dialogs, and they can be simplified and tailor-made to the user experience: if a user is from department X and only needs to see certain operations, they will only see those operations and nothing else, while people from another department will see other things. And of course we should use some infrastructure for automated testing, to make sure that when we make changes they don't break the existing workflow.

So automation is very good, but we should always make sure that it is well planned. What I mean by that is that if you start to automate somewhere with no pre-existing automation, you have to plan what the tasks are, and you have to have a vision of how those tasks should look. If you don't have that vision, you're going to build something that probably is not going to work together with
the rest of the infrastructure. So: target a problem, solve it, go to the next one, and automate whatever is possible. The aim should be that you press a button, everything spins up, and everything is automated, because you definitely want to get out of the blaming circle. You don't want engineers who touch things without knowing what they're doing, break them for another team, and maybe all use the same login credentials to get in. Automating solves this problem. So the mindset should be: either it's automated, or it's not happening.

Ansible, in my opinion, is a great tool for that, because it's very easy to onboard people. If you compare it with, for instance, Puppet (and we're not comparing apples with apples, because Puppet is a much more complicated thing), Ansible is very easy to onboard because it's just a bunch of YAML files, and there are a lot of prepared packages on Ansible Galaxy. The most important thing we have to do is convince people that the best way to go is: automate, test, check, repeat. This normally works pretty well, but it takes a lot of time to get there.

We had a demo, but unfortunately the internet is very slow and every click takes about 10 minutes, so I guess we're going to skip it, because there is a lot of clicking around. That's a bit of a pity.

Another important thing is to provide a simple user interface. If people can request what they want through a simple user interface, instead of having to send emails, fill in long forms and contact people, they will be much happier and more willing to use it. Also, when you create a user interface, it needs to be difficult for
people to work around it. So you should verify the inputs and let the user choose only what they should use and nothing else. This also prevents them from selecting the wrong thing, getting it all wrong, and going back to step one. Keeping it simple is also very important, because when things get too complex, people, especially people who are new to it, will just say: I don't want to do this, I want somebody to do it for me. So, as I say, validate the inputs and don't leave room for mistakes.

Adding something to the above: if you make things very complicated, it's like when you try to shut the cat in a room and you leave smaller and smaller gaps for the cat to pass through. It will go around. What I want to say is that if you make things very complicated, people are not going to use them. The simpler it is, the more people will use it.

One very important thing is that every step you do, no matter how simple it is, has to be replicable. You have to build some kind of automation that lets you replicate the steps at any point. It is expensive, because it normally takes a lot of time, and it scares people away at the beginning. If they're used to just doing things by hand, when they start to automate they say: why should I spend half an hour automating this when I can do it in five minutes? What we try to make them understand is that those five minutes, repeated across the entire data center where you have maybe 500 servers, add up to much more than half an hour. Testing all of this is expensive too, because you have to replicate your infrastructure. But now you have tools like CRI-O or Docker or Vagrant boxes that you can
easily spin up and destroy. It's never going to be exactly the same as what you have in the data center, but it can get pretty close, and it's absolutely necessary, because when you test a lot you avoid the mess.

As much as we can approach the problem from a technical standpoint, if people are not willing to do it, there's nothing you can do. So you need to change people's attitudes. Throwing technology at the problem is not the only solution you have; something more is needed. If you do not communicate, if you do not build standards, if you don't make people feel responsible (and that's very, very important), nothing is going to happen. Having people be accountable for what they do is actually the most important thing you can do in your company, because when people feel responsible for what they do, they care about it, and if they care about it, performance goes sky high. Standardization helps a lot with that: finding a common ground for those people to collaborate on, and being able to read what someone else wrote, helps a lot. Standardization is indispensable for this.

Open communication is also very, very important, not only as the buzzword from the open communication framework, but in the sense of really being able to communicate within the team. It's core: there is no agile process and there is no DevOps if you do not communicate. In this kind of company it's the main issue. You go to a company in which the organization is very tight, and you can't actually talk with people, you can't talk with your boss, you can't tell your boss that you failed. And that's very important, because if you can do that, things get much, much
easier. Having a clearly structured process is helpful because it avoids misunderstandings. People should know what they should do, and people should participate in defining these processes. When things are not clear, that's the worst, because people diverge, and a single process becomes something that depends on who you talk to.

Standardization again: when you define standards, you avoid ambiguity, and if you avoid that, you avoid the guesswork. If everything is standard, there is no reason to argue, because you can just follow what you agreed is the standard. It normally guarantees very high quality and very high productivity, and if productivity is high, morale will be high as well. People will be happy because they see that things actually work, and work well; having things that don't work is extremely frustrating.

The problem is who owns the topics, and how do we deal with these ownership problems? One simple thing you can do is look at where the complexity is. If generating the inputs requires a lot of inside knowledge of the product or area, and this inside knowledge sits only with, for example, team B here, then team B should take responsibility for this part. Unfortunately it's sometimes not so simple. Sometimes there are multiple teams involved, and then what you can do is split it into different parts. For example, if you use Ansible, you can create different labels for each purpose, and that's it.

It's also very important to be nice, to talk with people and not be mean to other teams. It happens often, again on the human side, that some people don't like other people and then try to sabotage them. So it's important to have a good environment, or at least somebody who makes sure to defuse all this animosity between
the teams.

So, in the end, how did we implement that? How do we normally implement the CI/CD model? How can you actually test infrastructure and avoid failing when you go to production? It's just an example, but I found out that it works pretty well.

For automation we use Ansible, and Ansible has this very fancy thing, a sort of GUI called Ansible Tower, that can automate automation processes. In there is the concept of a team, and you can connect it to your LDAP. So you have an LDAP directory, you have different teams, and those teams have different permissions. What I normally try to do is map those permissions to the same permissions in the Git repository, so that each team owns its own modules and can contribute to others, but you always know who is touching what. That's super important.

Now, when you write Ansible roles, things can get pretty complicated to test, because YAML linting is normally not enough. So what we use is Molecule. Molecule is a Python library, which is extremely cool, in which you can define what kind of tests should run against the infrastructure. It uses testing frameworks written in Python in which you can assert what you did. For example: you're installing nginx, you want nginx to listen on port 80, and you've written a playbook for that. So you write a Python script that says: connect to this machine and check that port 80 is open. Your people can run it on their laptops, and we normally enforce it with a pre-commit check. The cool thing is that it's written in Python and uses the same libraries as Ansible, so you can test on your laptop whether this is going to work or not. It also works with Docker and with Vagrant, so you can
actually test both cases, whether you're using containers or virtual machines.

The other tool we use quite a lot is Jenkins. Jenkins is pretty much everywhere, so I believe most of you are comfortable with it. What we use Jenkins for is triggering two different things. One is Satellite, which can be connected to RHV or whatever, to spin up a Linux virtual machine and do the testing: try all the roles and all the playbooks, because Molecule can't test playbooks, it can only test actual roles. So to check everything, we spin up a virtual machine and test everything we wrote in the Ansible roles. Then we use Molecule to check everything on Docker as a sanity check, so that each Ansible role is tested individually and then all together.

And there is one tool that, I found out, not a lot of people know. It's called Serverspec. It's a great thing in which you can define how your machine should look after the automation has run. It has been written by the guys at Chef, the other automation platform, and it can work pretty well with Puppet too; at one of our customers we had both Puppet and Ansible. What it does is describe how the machine should look. So you write your server specifications, you run all your playbooks, you take the result, and you check with Serverspec whether what you got is what you were expecting. You can loop over this and find out whether all the servers will be in the same state.

Ansible is not a state machine, so it doesn't enforce state; it just applies the playbooks one after the other. Puppet instead generates a state: it compiles the state and then applies the whole state to the machine. In this they are substantially different. So sometimes it's very easy to use Ansible, because it just does things one after the other, but on the
other hand you can't really check whether what you get is what you were expecting, because it's hard to test. Serverspec solves the problem, because you can gather facts, check how the server looks, and check whether it's what you were expecting. And in cases where you also have something else running there, like Puppet or Chef as a pre-existing automation model, you can still use Serverspec to compare the results. So it's an extremely powerful thing. You can integrate it with a Ruby CI/CD tool called KitchenCI. Unfortunately it's not in the schema, but you can use KitchenCI to automate even this part: the result gathering and the test with Serverspec.

So if everything works, and Serverspec, or KitchenCI in that case, or Jenkins, or whatever you're using, gives you the thumbs up, then Ansible core, triggered by Ansible Tower, will go through the server farm and apply all the changes. Like this you preserve the ownership of each role and you know who is touching what. The checks are enforced on your laptop, and after the commit to Git, Jenkins takes care of the rest: Serverspec checks that what you did is actually what you wanted to do, and then Ansible does the rest. I did this initially for one customer, and I figured out that it's a pretty general model and it normally works pretty well. So try it out.

Just one more thing: many Ansible modules are stateful, and you can also do testing inside the playbook.

So what did we learn from all this? First, technology alone is not going to solve the problem. Think about this before you apply any technology change in a company, especially old-school companies; if you have 10 people in your startup, you should normally be able to talk and come up with a solution. A technology change won't fix anything if
there isn't a culture change along with it. To achieve results you have to achieve both: you have to grow the culture of your company together with the technology. Normally, using an open collaboration framework can make the technology much more effective. The open collaboration framework is a system for innovating. I'm not saying that you should go by the book and just follow what the open collaboration framework says, but having a look at it is worthwhile, because there are a lot of good ideas in it.

Also, invest time in making things run smoothly between the different teams. Again, if people don't want to work together, it's very difficult to achieve things; there will be a lot of sabotage. And when you go to work for a company, you don't want to have a bad time, right? You want to enjoy what you do. So it's good to try to get people to settle their differences. To settle those differences you have to be fair. What we've seen is that if the company or the management wasn't fair with the employees, they would sabotage it. If you want to avoid this, you should be fair with the people who are working with you.

And again, try to automate as much as possible. If you automate your tasks, and not only your tasks but you take part in automating the whole process, then you will have more time to do engineering. I think most people don't enjoy doing repetitive tasks. Most people are in this job to create new things, to think and create solutions, not to do copy-paste and things like that. If you automate, you can focus more on this kind of work instead of repetitive things. And if you make it easy for your users to request your services without having to contact you and send emails to
people and so on, if you make it easy with a user interface, you will also get less disruption in your daily work.

Saving time by automating doesn't only mean that you have more time to build better things; this time should also be invested in training. It doesn't need to be some super expensive training. Gathering the teams together and having them talk about what they're doing, the technology they're using and everything else they do, even if it's completely unrelated, is normally very beneficial for everybody. Again, you can't achieve any kind of goal if you don't have communication inside a company. In my opinion this was the most important learning from all those companies we went through: if people are not nice to each other, if people don't want to share what they do, if people do not communicate, nothing is going to happen and things will remain dead.

Thank you a lot. Do you have any questions?

Well, some of them, yes. It's not a specific company, it's more of a generic kind of thing. Some of them try; some of them claim to be agile but in practice they are not, it's just for the checklist. They use Jira, and that's as agile as they get. Any other question?

You talk with the people. We have some frameworks at Red Hat for talking to people, but in companies like these I found workshopping very effective. If you go there and you want to pitch your idea, showing what your idea is, is very effective. So just gather people together and try to workshop something. If people get along, they get along; otherwise, you can't change people's minds that much. But normally being nice to people and trying to
pitch your ideas is an effective way, or that's what I found to be effective.

Usually, when you are a consultant, you go there, you start meeting people, and you start getting a feeling for the politics inside the company. Based on that you can talk to people, you can talk to management, and try to make them understand that if they don't work together, this project is maybe going to take ages and is not going to work in the end. So it's just talking, and observing what happens inside the company. Any other question?

Sorry, say it again? Well, that's a tough one. I don't have a very good way to motivate people; I don't have a solution for that. I don't think there is such a thing as one size fits all. I think it depends on the people. Again, investing in their education, for instance, is one thing. I like learning new things a lot, so I'm very motivated when the company allows me to gain more knowledge. Another thing could be team-building events. That is also a great thing. Some people like it, some people don't, but in the end I have pretty good memories of companies that had us do some cool team events.

You can also let them get more involved and get them to participate if they want: allow them to raise their voice and give feedback. If you're a manager and you say "this has to be my way or no way", this will make people very demotivated. If you make it more participative among all the people, then I think most people will feel more motivated to work. Anyone else?

What I normally do is, again, take the people responsible for the teams, get them together in a room, make them put their heads together, and come up with a
standard. Again, it's all about talking; in the end you have to communicate.

Yeah, definitely. I wouldn't say blame, but just let them know that it is not right. Not too much, because if you blame people too much, in the end they may feel like "oh, this is not for me", and they won't get over the learning curve; they will pass. So normally what happens is that I have to lower the strictness of the checks quite a bit, but at least a sanity check and this kind of thing is easy to do. And again, if you write Ansible code, I firmly recommend using Molecule, because it does YAML linting, Ansible linting, testing and all this kind of thing; it's pretty good at that. You always have to set the bar at a certain point, because otherwise people go crazy. It's like when you write C code and you put in -Wall and -pedantic: you're not going to compile anything, right?

I normally speak with them very clearly and ask them: how much time do you spend a day doing the same task? Wouldn't you rather do it once, properly? This is the nice way. There is a bad way, which is enforcing that everything that is not automated gets scratched by the automation itself. Puppet, for instance, is very good at this, because you can define: look, the machine has to look like this, and if you touch it elsewhere, Puppet runs and scratches it again.

You also need to have management on board. Some people will be motivated, but other people will think: okay, management is pushing me to do this, I have to automate, and I don't have time. So there also has to be management support, because in the end they are in control. Last question, anyone?

In my experience it's normally the network team that doesn't want to automate much, because it's
very hard to test. For instance, if you want to apply rules on a firewall and you do it in an automated way, and the rule is wrong, you kick your company off the network, right? So I normally advise starting with small things. If it's a very difficult area, like networking or firewalling, it might even make sense for them to keep doing some things manually, because they want the double checks; that is an area in which it can get pretty bad if you automate it wrong. It's probably easier to tell them: look, take your time, but do it properly. Of course you should push them a bit, but in some specific areas I can understand that it's pretty complicated.

Yeah, management has to consider it a long-term goal to get everybody used to automating. I also think it's useful if there is a dedicated team educating people about automation, and helping people understand. Normally what I see in IT is that everybody is super loaded with things. Automating is not going to replace your job, because the job changes constantly, and it's better if you are less stressed and can do engineering, as I mentioned before, than if you are doing things manually. I think it takes time.

Well, I guess we're done. Thanks a lot again. Thank you, and have a good lunch.
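The nginx check described in the talk ("connect to this machine, check if port 80 is open") can be sketched in a few lines of Python. This is only an illustration using the standard library: it is not the testing framework that Molecule drives, and the function name `port_is_open` and the host name in the usage example are assumptions made for this sketch.

```python
import socket


def port_is_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port can be established."""
    try:
        # create_connection resolves the host and attempts a TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name resolution failures.
        return False


if __name__ == "__main__":
    # Hypothetical target: a test machine that the role should have configured.
    host = "test-web-01.example.com"
    print(f"port 80 open on {host}: {port_is_open(host, 80)}")
```

A check like this only proves that something accepts connections on the port. A fuller test would also assert that the nginx package is installed and the service is running, which is the kind of assertion the Python testing frameworks used with Molecule are meant for.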