...for the record, and welcome, everyone. It's time to start — we like to start on time — so thanks very much. This is a Jenkins Online Meetup, and today we have the privilege of having the T-Mobile development team present POET, the pipeline system they've deployed to thousands of developers at T-Mobile. So, Martin, would you like to lead us off? We'll let you and Ravi go back and forth.

Yeah, sounds good. Thank you, Mark. We'll just introduce ourselves here. What we're going to do today is go through the vision and approach we had for this pipeline, our strategy, and what motivated us to end up where we did; then Ravi and Larry will take you through the high-level design and the implementation, and get down to the core of the engine. So it's those three areas: here's what led us to this point, where I felt motivated to do things a little differently from how pipelines were traditionally being viewed at T-Mobile. Ravi was a key person in helping us drive adoption and working with our teams at the time — you'll see he now has a manager title; he just got that role a few weeks ago in another team, which is a great opportunity — but he was really focused on working with our customers and understanding their needs, and that fed our product pipeline as we continued. And Larry was the guy at the heart of the engine, who spent a lot of time putting the core parts of our engine together, which everybody else then leveraged.

As noted, I'm Martin Crinky, senior manager of product and technology at T-Mobile. Ravi Sharma, as I mentioned, is now manager of product and technology at T-Mobile in one of our other pipeline teams, but he was basically what I called our product manager for customer support. Customer support was so important to how we looked at, talked about, and listened to our customers that we actually had two product managers: one for the heart of the pipeline activities, and one focused purely on the customer — and we found that paid off quite well. And Larry, as I mentioned, was really our guy when you wanted something technical built; he's the one under the covers who did a lot of the core work.

All right, so here's where all this came about. When we talk about managing a CI/CD pipeline, there are a lot of complexities in building and designing that thing, especially from a Jenkins perspective. You have all your plugins to manage; if you're on VMs, you're keeping those going and getting them updated. There's a variety of deployment types people want to do — blue-green, and especially as we move into DevOps, canary deployments — and a lot of other workflows you need to integrate. So you start doing more and more of this, and your pipelines become more and more customized. And operationally, you want to make sure everything's up and running: should I be running five pipeline engines — five Jenkins masters — that people work off of? Or do I have one massive centralized one, which means I need
fewer people, maybe, and a tighter focus? And then from a customer support perspective, there are all these other things — and I will say customer support is often overlooked, especially in a shared service organization. As much as we think about the tooling we have to provide, are you really thinking about your customers? If you're 24/7, how are you going to handle that? Do we have sufficient documentation? How are we training people? All of those are important, and at least our experience here at T-Mobile had been that, especially when somebody was trying to build a more centralized capability that other teams could use, this category was really lacking. And if it's lacking, everything else tends to suffer: frustrations mount, and it becomes a challenge for everybody — the users of the platform, the people creating the platform, and management trying to make sense of whether to continue with what they're doing or go a different route. So this was an overview of the kinds of things I was noticing, and that we were certainly dealing with in the work we were doing with the teams we supported at the time.

All right, so some of the key visions for the pipeline. POET — our team — stands for Pipelines, Operations, Environments, and Tooling. We're a shared service organization that works a lot around SRE-type functions — site reliability engineering — and other things that help streamline application development and practices, primarily for teams on our front line: think of our retail stores, or Care, the folks you reach when you call in and need customer support. Those are our primary focus, but we also go beyond that and help other teams as appropriate and as requested. As we looked at this, we really wanted to streamline the development pipelines using containers. The idea is that rather than writing lots of different custom code, let's create a library of Docker containers that can execute a variety of things — and we'll have some examples coming up. Another goal: developers like their flexibility. There's always a reason an engineer says, "hey, I can't quite do that, because I need this one other thing and you don't have that for me." So we really wanted to give them flexibility and adapt to the model they were using in their development — not be too prescriptive. That's usually the downfall of a lot of shared service solutions: when you're so prescriptive that people feel they have no flexibility. And again, a key thing was to let them focus on the development and testing aspects of whatever software they were building, rather than spending time maintaining their pipelines. This was interesting — it came up just recently: our senior vice president commented that everybody wanted to show him their pipeline. He said, "I talk to anybody and they go, oh, you've got to see our pipeline — and I don't know why." Well, one of the things I talked about was why they'd been wanting to show their
pipelines: it's really because they were complex. They're very proud of them — they spent a lot of time working on those pipelines — so they say, "hey, we want to show you this, because it's really cool." But our take was that a pipeline should not be that big a deal. It should be just, "oh yeah, I have a pipeline, we run our stuff, we can deploy, we build everything we need — it's a piece of cake." It should not be that big a deal. And that's where we at T-Mobile started to shift: the POET pipeline, and some other activities that have been happening, have really been trying to move away from the idea that every team has to create its own pipeline, because it's not efficient. For the senior VP — he has, I don't remember, thousands of people under him — it's not economical if every team builds its own pipeline and does all that customization.

So one of the other things we did was set some principles for how we were going to work. We have guiding principles for how we work as a team, but also guiding principles to check ourselves against as we built this pipeline out. First: let's make it easier to implement new capabilities without impacting other users. What does that mean? Well, if I write all-new code in my engine, everybody is including it in a Jenkinsfile and running that code, and when I make changes to that library I could be impacting those folks as they load the new version — there's a lot of extra testing that has to happen. If I'm upgrading plugins, do I know all the plugins still work? One person told me he'd come into a company that had over a hundred plugins running in their Jenkins instance — and version to version, those plugins can change. And there was a pipeline here at T-Mobile — not our team specifically, but one trying to do more of the global "hey, all you folks can come use our pipeline" — that was literally taking two months to migrate from one version to the next. That's not getting to what we're trying to do with DevOps and continuous delivery: whether it's my app or the actual platform I'm giving you, that platform needs to be able to move under those same methods as well. So that was a challenge we were very concerned about.

We also wanted to make usage and onboarding faster and easier — again, it's about that customer, right? So let's think about our scalability, our reusability, our flexibility for the dev teams, and hit all those marks. These are things we checked ourselves against as we iterated. And then we didn't want what we call the CI/CD pipeline specialist. We ran some numbers and calculations on how much money we were saving, and if you get down to it, in a larger organization where teams have their own pipelines, any given team ends up with somebody essentially dedicated to that pipeline. They may not spend 40 hours a week on it, but when there's an issue with the pipeline, they have to drop everything and
be right on it. That becomes very disruptive to your other planning and the other work you'd like that person to be doing — going back to that concept I mentioned of letting people focus on developing code. And it's hard to find those specialists. So we wanted to reduce teams' reliance on them, which also saves money and lets them focus where they want to be. And then: can we abstract away the underlying technologies that drive the pipeline? Yes, we use Jenkins under the covers — Jenkins does a lot for us and gave us a foundation so we didn't have to do everything from scratch, and there is value in some key plugins: somebody's got a plugin that lets me get to Git, or send something to Splunk, whatever it is, and I can leverage those. But let's minimize them — and let's not make teams even have to know we're using them. So those were our key principles on that front.

All right, and I think this is my last slide. I just want to note: if you're a shared service supporting multiple teams with pipelines, remember it's not just about the technology. You can build a really cool, awesome mousetrap, but — and you might notice the theme here: customer, customer, customer — if you don't focus on the customer, you'll have a real challenge. You can think you're listening to them, but it's very easy to slip into "yeah, but I know what's best for you." Listen to what a person has to say; give them the chance to provide feedback. That's really the third bullet, and it's where we did this: we always ran a customer satisfaction survey, and we were running at about 4.9 out of 5. It was a set of questions we tried to keep very open — we didn't try to be leading with them; sometimes people put out questions with assumptions baked underneath. It's just something to think about: you can do great things with technology, but if you forget who your customer is — the development teams, folks like that — you might have a bit of a struggle. And with that, Ravi, I think I turn it over to you.

Yes, thanks, Martin. Hey, I'm Ravi Sharma. I'll walk you through the current physical architecture of the pipeline we have at T-Mobile, and then the different features and capabilities of the pipeline. What Martin just explained really comes down to the customer: we are very customer-focused, so whatever we've designed — whether it's the physical architecture of our pipeline engine (when I say pipeline engine, I'm just talking about Jenkins) or the pipeline library, the pipeline framework we've developed — we've kept our customer at the center of the table, checking whether each solution we design is actually helping the customers or not.
Because when you design a pipeline library in your organization, the DevOps teams have certain questions as they start using it. How easy is onboarding onto the library? If you have a new application, is a centralized team in your company taking care of all the application onboarding, or can a DevOps team scale by itself — and how easy is that? How much of a learning curve is required to understand the pipeline? These questions come out of our experience over the last seven years — we've burned our hands on a lot of different ways of working with Jenkins and creating libraries, and that's how we came to the conclusion that this is the best architecture and library we could design. Do we really need a dedicated resource? How can we extend the pipeline with new features and capabilities when we need to? And there's a lot of duplication in a microservices kind of model, because across microservices you're following similar methods of building, testing, and deploying to similar platforms — so you end up with a similar Jenkinsfile you have to repeat over and over for every component. Can that duplication be avoided? These are the questions people ask you.

So, first, the physical architecture. What we've basically done is deploy our Jenkins into a Kubernetes namespace: Jenkins itself runs in a container, and we leverage Kubernetes along with it. As you can see in the picture, the master is in a container, and we have a persistent volume for the Jenkins home, right there on the Kubernetes cluster. And we keep only a very few Jenkins plugins. Plugins are basically the heart of Jenkins — the more plugins you add, the more features you get — but in this case we wanted a Jenkins with a small number of plugins, just enough to support our pipeline, so we use only four plugins in our pipeline engine. Then we have step containers — the build step containers and the different components in your build file that you're going to use — and they spin up in the same Kubernetes space. What else have we done? We use Splunk for logging, we use AppDynamics, and we use a Spring Cloud Config server to store our encrypted passwords. The reason is that this whole setup is fully automated: we use Jenkins Configuration as Code along with the Spring Cloud Config server, so when Jenkins comes up, it takes about two minutes and some seconds to bring up a new instance for us. And we do horizontal scaling — as of today we have 28 different Jenkins engines running, one per team, organization-wide.
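To make that "fully automated, pre-configured" idea concrete: a minimal Jenkins Configuration as Code file along the lines Ravi describes might look like the sketch below. This is purely illustrative — the credential IDs, environment variable names, and repository URL are invented, not T-Mobile's actual configuration — but the JCasC keys themselves (credentials, global shared libraries) are standard.

```yaml
# jenkins.yaml -- minimal JCasC sketch (illustrative; IDs, env vars, and URLs are hypothetical)
jenkins:
  numExecutors: 0          # all work runs on dynamically provisioned agents
  systemMessage: "Team pipeline engine -- fully automated, do not hand-edit"

credentials:
  system:
    domainCredentials:
      - credentials:
          - usernamePassword:
              scope: GLOBAL
              id: registry-publish          # hypothetical global credential ID
              username: "${REGISTRY_USER}"  # injected at startup, e.g. from an external config/secret store
              password: "${REGISTRY_PASS}"

unclassified:
  globalLibraries:
    libraries:
      - name: poet-pipeline                 # the shared pipeline library
        defaultVersion: master
        retriever:
          modernSCM:
            scm:
              git:
                remote: "https://bitbucket.example.com/scm/poet/pipeline-library.git"
```

Because everything a master needs is declared in files like this, standing up (or replacing) an engine is a matter of re-applying configuration rather than hand-restoring state — which is what makes the two-minute spin-up plausible.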
Then we have a backup utility. Jenkins home stores everything for you, right? In our case we've made the Jenkins home itself a Git repository: every few minutes a job runs and takes a delta backup into Bitbucket, our version control tool. What that buys you is that if your Jenkins instance goes away for any reason, you can restore it within the same couple of minutes — loading Jenkins is very easy for us because we use very few plugins, and all the configuration is stored in Bitbucket, so it's easy to restore.

Now the agents, the workers for Jenkins. They can be spun up in the same Kubernetes space or on another Kubernetes cluster — we use dynamic allocation and dynamic provisioning of the agents. We can do similar things on AWS, and in our case we use both. And at the end we have a Grafana dashboard: when the steps in a pipeline execute, all the metrics are collected and surfaced through the Grafana dashboard.

A few more details on this. The complete infrastructure is on Kubernetes, including masters and agents. Each team gets its own pipeline engine, so teams are isolated: if something happens to one of the masters, the other teams aren't impacted, and it's very easy for us to spin up any number of masters on Kubernetes, so we don't worry about that. Jenkins Configuration as Code is highly utilized, because we use a minimal number of plugins and they're all pre-configured. Credentials work the same way: there are global credentials that can be used across teams, but for team-specific credentials we've provisioned folder-level access — teams create the credentials specific to their applications within their own folders, so nobody else can see them, which also matters from a SOX compliance perspective. We use four core plugins. You can have as many plugins as you want — I remember before this architecture we had 200-plus main plugins, and you can imagine all the dependent plugins that get installed along with those. When I say four core plugins, those are the main ones; some dependent plugins still get installed with them. We've extended to 16 plugins at T-Mobile because we use things like Splunk, and if somebody wants build times in the UI, you need a plugin for that — so we extended to 16. We also removed a major dependency by storing credentials in Vault: our credentials are stored in Vault now, and via the Spring Cloud Config server, when Jenkins comes up it pulls the encrypted credentials and stores them in the global credentials in Jenkins. And when I say almost zero maintenance — it used to be a huge pain when you had a single master: maintaining and upgrading plugins is a huge task for any organization, and I've faced that a lot myself. But by keeping the plugin count down, having team-wide Jenkins instances, and automating the whole process, it's easy for us — that's what really gets us to zero maintenance.
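The "Jenkins home as a Git repository with delta backups every few minutes" idea could be sketched as a Kubernetes CronJob along these lines. This is a hedged illustration, not the team's actual utility — the image, schedule, paths, and PVC name are assumptions:

```yaml
# backup-cronjob.yaml -- sketch of the "Jenkins home as a Git repo" delta backup
apiVersion: batch/v1
kind: CronJob
metadata:
  name: jenkins-home-backup
spec:
  schedule: "*/5 * * * *"            # every few minutes, as described
  jobTemplate:
    spec:
      template:
        spec:
          restartPolicy: OnFailure
          containers:
            - name: git-backup
              image: alpine/git      # any small image with a git client would do
              command: ["/bin/sh", "-c"]
              args:
                - |
                  # assumes push credentials (SSH key or token) are mounted separately
                  cd /var/jenkins_home &&
                  git add -A &&
                  git -c user.name="backup-bot" -c user.email="backup@example.com" \
                      commit -m "delta backup $(date -u +%FT%TZ)" || true &&
                  git push origin master
              volumeMounts:
                - name: jenkins-home
                  mountPath: /var/jenkins_home
          volumes:
            - name: jenkins-home
              persistentVolumeClaim:
                claimName: jenkins-home-pvc   # hypothetical PVC name
```

The `|| true` makes the commit a no-op when nothing changed, so only genuine deltas land in Bitbucket — and restoring a lost master is then just "re-create the container, clone the home directory back."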
So, Ravi — yeah, Mark here — would it be okay if I injected a few questions that have arrived, specific to the things you've described so far? Sure, absolutely. Okay. One of the questions is: are you using Jenkins Job Builder? My assessment was that you're probably not — Jenkins Job Builder didn't look like it was involved in the structure you're describing. No. Okay. And are you using CloudBees Jenkins — CloudBees Core, the product — or open source Jenkins for these instances? This is open source Jenkins. Thank you, okay, great. Thanks very much — I'll continue to gather other questions and disrupt you periodically. Thanks again for letting me interrupt. Thank you.

Okay, so now I'll talk about a pipeline execution example — what a pipeline looks like with this design. As you can see in this picture, a user creates a pull request once he's done with his task on the source code — the source code, in our case, lives in Bitbucket. We have something called the pipeline definition file, a YAML file where you declare all of your steps — whatever different steps you'd like to perform as part of your pipeline: some pre-build steps, then a build step using whatever tools, technologies, or programming language you use, and then things like notifications, SonarQube or testing, and then deployment. All the steps are listed in this pipeline definition file. Now, in the picture, each of those steps is a container: build-Java-8-with-Maven is one container, Slack-notify is another, then SonarQube, a notify step, then deploy-to-K8s — K8s being a Kubernetes deploy container — and the last step is InfluxDB logging, which feeds Grafana. So the whole pipeline runs with each step as a container, and we have something called pre and post steps, which Larry will explain in detail. Once the pipeline runs, everything gets logged, and Grafana creates beautiful graphs for us about the CI/CD pipeline itself.

The piece at the bottom of the picture — the POET shared, reusable step container library — is something we created on our own, and there's open source code available for it. In this step container library we have 40-plus different containers available for T-Mobile users. They're very generic containers built for different tools and technologies, and users can extend them: they can take the containers we've created as base images and extend them for their own functionality in the pipeline.
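As a rough illustration of the flow in that picture, the definition file might look something like this sketch — the image names are invented stand-ins for the library's real containers, and the field names approximate the schema Ravi describes rather than quoting it:

```yaml
# pipeline.yaml -- sketch of the execution example (image and field names are illustrative)
pipeline:
  steps:
    - name: build-java8-maven
      image: registry.example.com/poet/maven-jdk8
      commands:
        - mvn clean package
    - name: sonarqube-scan
      image: registry.example.com/poet/sonar-scanner
      commands:
        - sonar-scanner
    - name: slack-notify
      image: registry.example.com/poet/slack-notify
    - name: deploy-to-k8s
      image: registry.example.com/poet/k8s-deploy
      commands:
        - kubectl apply -f k8s/deployment.yaml
```

Each entry spins up as its own container in the same Kubernetes space, so extending the pipeline is a matter of adding another entry rather than writing engine code.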
And with this picture, you can see how easy it is for a user or a DevOps team to extend a pipeline. If they have to add something like JMeter, for example — I'm just taking an example — or they're deploying to a different platform, say AWS, they just create another container — we call it a step container — add it to their pipeline definition file, and it extends the pipeline automatically. That's how easy it is to extend the pipeline and get it up and running with new features.

This is a sample of the pipeline code, and as you can see, there's not much learning required of the development team here. The learning, in the sense that the development team should understand how to create Docker images and how to read or update a YAML file — those are the only two things I expect from a development team when we say they can extend and use this pipeline by themselves. There are three components: one is the global section, another is the global environment section, and the third is steps — and that's all there is to this whole pipeline. In the global section you specify the application name — which application or microservice you're going to build — and the application version. As you can see, we have branch-wise versions here: master has 2.4.2, and the feature branch has 2.4.1. The reason for branch-wise versions is this: if you have a master branch and create a feature branch off of it, the feature branch's pipeline builds on 2.4.1; when you start merging the feature branch back into master, you can get conflicts if the feature branch also carried 2.4.2, because it would override the master version. Keeping the versions separate for feature and master avoids that — that's how the app versioning works. Then you have global environments: think about writing a program — in shell script or any language — where you define a lot of environment variables that you want to use throughout the program, across your different classes. You define similar things here.

The next thing is steps. Steps are the different activities you're going to perform as part of your pipeline. A step has a minimum of three components. First, a name — "jar-build" here; I generally recommend people use hyphens rather than spaces, to make sure the UI looks good. Then an image, which is the step container: the image contains all the tools required to complete that particular step — in this case I'm doing a Maven build and need JDK 8, so that's what's installed in this image. Then there's a command section. The command section is basically a playground — it's like a VM for you, where you write all the different commands you'd like to execute in that image when it starts up as a container. The example here shows a single command, but you can have multiple: if you understand YAML, the hyphen means it's an array of commands, so you can write as many commands as you want.
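Putting those three sections together, the sample Ravi is narrating might look like the sketch below. The field names follow his description and may not match the real POET schema exactly; the application name is hypothetical, and the second step previews the Docker image build he describes next:

```yaml
# pipeline.yaml -- sketch of the sample being described (schema approximated from the talk)
global:
  appName: inventory-service        # hypothetical microservice name
  appVersion:
    master: 2.4.2                   # branch-wise versions, as described
    feature: 2.4.1
  environment:
    - MVN_PROFILE=ci                # user-defined variable usable in any step

steps:
  - name: jar-build                 # hyphenated names keep the UI readable
    image: maven:3-jdk-8            # step container with the tools this step needs
    commands:                       # a YAML array: as many commands as you like
      - mvn clean package -P ${MVN_PROFILE}
  - name: image-build               # follow-on step: wrap the jar in a Docker image
    image: docker:stable
    commands:
      - docker build -t inventory-service:${PIPELINE_APP_VERSION} .
```

Note how the global environment variable (`MVN_PROFILE`) and a pipeline-exposed variable (`PIPELINE_APP_VERSION`, discussed shortly) are both consumed inside command sections — the containers themselves stay generic.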
The second option is to write a shell script and then call that shell script here — but the script has to be available in your source control tool. And all the environment variables defined at the global level can be utilized: you can see that with the build container, the variable we defined in the global environment section is being used in the image here. In a similar way you add another step: say you want to do a Docker image build — once your jar file is ready, you want to take the jar and create an image out of it — so you write the next step: again a name, an image, and a command. You keep writing steps like this, with whatever step containers you need, and your pipeline keeps growing.

Now, I think people here are well versed in the shared pipeline library, which is Groovy code you have to write. And many of you may have experienced that if you provide a library to a development team and ask them to extend it using Groovy — well, first of all, this isn't normal Groovy; it's very specific to Jenkins, and it's hard for a development team to understand the internals of Jenkins and how to turn that into the Groovy that's specifically used for pipelines. In our case, it's just a plain YAML file: you need the containers to be available, and you should know what commands you're going to run in them — and that's all there is to this pipeline framework. That's why we call it a framework: the framework itself doesn't need to change unless a major new feature comes in that needs to be distributed to all the development teams — only then do you go back to the library and change it. For the rest — say you're adding JMeter — you don't need to go back to the library and add dependencies to your Groovy code; you just create a container, and that extends the features and capabilities of your pipeline.

On to the pipeline's value — most of what I was just saying is summed up here. It's a framework where the pipeline execution is defined by each team: anybody on the development team can work on the pipeline and extend its features in the YAML file — you don't need a specific person or a specialist. Easier and faster onboarding: it's proven with us that, with all the information in hand, you can onboard an application or microservice onto the POET pipeline in less than an hour, and that's what we've been doing at T-Mobile. No big YAML or Groovy script maintenance — though since we're using YAML, if you have 20 different steps to perform, you'll rightly say the YAML file is getting big again. To solve that problem we introduced templates — the next item, reusable templates. Every step you write in the YAML file can be converted into a template.
Say, for example, you have 100-plus microservices, and they all follow the same way of building their pipelines: the same Maven build, the same Docker builds, SonarQube, and so on. If I use a plain pipeline.yaml, I have to check out 100 different repositories and check in the same YAML file over and over again. Then say another feature or capability needs to be added — again I have to check out all 100 repositories and update the pipeline.yaml definition file in each. But what if I have a template? If I create a template from those steps in a different repository altogether, maintain all the templates there, and include those templates in my main pipeline.yaml, then tomorrow, when a capability or feature changes, you make the change in that separate repository, and it doesn't touch your current pipeline files. In some setups I've seen, changing any pipeline-specific file triggers a build — auto-build on check-in — which isn't always what you want, because it's a pipeline-specific file, not a code change. When the templates live in a separate repository, you're not changing the pipeline.yaml in your source code, so changes to the templates won't trigger a rebuild of your code. That's how the templates work, and I'm going to show you how a template works and how easily you can include it in the pipeline.yaml definition file.

Step containers — I already spoke about these: a step container is a specific container for a specific task. In our case we have a library for them, and you can use Docker Hub to pull these containers into your pipeline. Consistency of approach across applications: because all the steps are similar, you get the same approach and the same consistency across all of your microservices as you build. Low maintenance: because the reusable templates and the step container library live in one place, you don't need to maintain them across all of your microservices' code bases.

This is a sample pipeline screenshot — I'm not going to do a demo right now, but this is how it looks in Jenkins when you have different step containers building through the pipeline. Now I'll switch over to the wiki, which is on GitHub, and walk you through a few of the features and capabilities of the pipeline. So here's the wiki for the POET pipeline, starting with how you do the installation. I think we just updated "pipeline engine masters" — the infrastructure we built, a completely automated system for standing up a new Jenkins. All the steps and the source code are available here; if you follow these steps you'll be able to stand up a Jenkins within minutes, that's for sure. Then, the library setup: you know there's a plugin called Global Pipeline Library in Jenkins — you just need to
have this code ready, then update the configuration in the global library section, and you'll be up and running in your Jenkins. In our case this, too, is automated along with our Jenkins: when we stand Jenkins up, these settings come pre-configured. Now, the how-to section — getting started with the pipeline. When you start working with the pipeline you need two files: one is the Jenkinsfile, the other is the pipeline.yaml file. I think all Jenkins lovers know that the Jenkinsfile says where your library is — what you defined in the plugin in the previous section — so you just provide those details here. The Jenkinsfile should be part of each microservice's repository, and it never changes unless you're changing a branch or other information like that; it doesn't change frequently. And then there's pipeline.yaml, which is, again, the global section and the steps defined within the file.

Now, which core plugins do you need? This came up in between, so I'll cover it here: these are the four core plugins you need to set up the whole Jenkins and get the pipeline framework up and running — you don't need more than that. You can install more — we have 16, because we want other features in Jenkins; for example, we log to Splunk, so we need the Splunk plugin, which is why we installed extras — but otherwise the pipeline works with these minimum four plugins.

Now let's talk about the different things you can write into pipeline.yaml — and thanks to Larry, who put the schema together. If you're not sure whether one of the variables you're writing into pipeline.yaml is integer type or string type, or what the length or validation should be, you can go through the schema and learn more. There are different sections in pipeline.yaml. One is the header, which has the version and the pipeline section. Then the global section, where you define the application name and application version. After that, the global environment variables, where you specify the environment values you're going to use in your step containers. And then the steps — those components make up pipeline.yaml. Now let me walk you through the step section: the different components and variables you can use in a step container. You can see we have environment, conditions, secrets, and control, and I'll walk through each briefly. A minimal step looks like: a name; an image, which is the container that will spin up; and the command section, where you can write a command, pipe multiple commands together, or point at a script to execute. Then there are step-specific environment variables: you have global environment variables usable across multiple steps, and then
specific environment variables that you'd like to use within a single step — that's what this section is for; you define those variables here. Now, notice that in this particular image tag I'm using something called PIPELINE_APP_VERSION. Wherever you see the PIPELINE_ prefix, that's the naming convention for environment variables already exposed by the pipeline library — they're not user-defined, though you can override them in your pipeline or in the command section.

Then we have conditions. Let me give an example of how we use a condition. Here we have an event clause that says branch: master, meaning this particular step will only execute when the branch name is master. Say you're working on multiple features and have branched off master. With a single pipeline.yaml — inherited from master into that feature branch — you don't want to delete or update steps just for the branch; you want to say, "deploy to the QAT environment, but not from my feature branch," or "don't execute this particular step in my feature branch." In that case you use the event clause: execute this step only when the branch is master. There are different ways to express this — you might want multiple expressions: "this step executes for both the master and release branches, but not for feature branches." In other cases you can write feature/*, meaning a step only executes when the branch matches feature/*. And there's a way to include and exclude as well: you might include master and feature/*, but within the feature branches exclude a few from executing that step. I think this is one of the important things when you have parallel development going on and you don't want every step defined in pipeline.yaml to execute on every run.

Then there's the environment condition, for skipping a step based on environment variables. We talked about two types: user-defined environment variables, and the pipeline-exposed, standard environment variables. Here you have an event clause that says: environment, PIPELINE_COMMIT_MESSAGE — if the commit message is "skip ci", exclude this step. So as soon as the pipeline sees a commit message containing "skip ci", it excludes that step from execution. In a similar way, you can use user-defined environment variables — a deploy-environment variable of QLAB or QAT, say — so a step only executes if it sees that the QLAB variable is true. And in the next example you can combine that with the user-defined ones — it's a little more complex, but you can define that kind of condition in your step as well.
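A hedged sketch of what those branch and environment conditions might look like on a step — the exact keys (condition, event, branch, include/exclude) are approximated from Ravi's description, not copied from the real schema, and the images are placeholders:

```yaml
# conditional steps -- sketch; key names approximate the schema described in the talk
steps:
  - name: deploy-qat
    image: registry.example.com/poet/k8s-deploy
    commands:
      - kubectl apply -f k8s/qat.yaml
    condition:
      event:
        branch:
          include: [master, release/*]   # run on master and release branches
          exclude: [feature/*]           # never run from feature branches
  - name: sonarqube-scan
    image: registry.example.com/poet/sonar-scanner
    commands:
      - sonar-scanner
    condition:
      event:
        environment:
          PIPELINE_COMMIT_MESSAGE:
            exclude: ["*skip ci*"]       # skip this step when the commit message says so
```

The point of the design is that one pipeline.yaml, inherited across branches, can still behave differently per branch or per commit without anyone editing the file on each branch.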
Now, secrets. By this time, I think you understand what we mean by step containers, right? Take an example: you want to publish your artifacts to Artifactory or a Docker registry. When we maintain the pipeline engine, we have global credentials that anyone can use, but Artifactory or the Docker registry requires team-specific credentials authorizing each team to publish its artifacts. Teams can put those at the folder level — but when they're writing a step, how do we inject those credentials into the step? That's what the secrets section helps them define. I'll admit that right now the POET pipeline only supports two kinds of secrets: username-and-password, and a single secret token. Only those two are supported right now, so we still need to work on the rest — SSH, if you have to use it, and the other credential types. In this example, the source is the credential ID in Jenkins — the ID you gave the credential when you defined it at the folder level or global level — and the targets are the variable names where the username and password get stored. They become variables within the container you've specified, and once they're variables, you can use them in your command section to run your commands. That's how the secrets section lets us run the steps that require credentials.
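A sketch of that source/target mapping in a step — the field names follow the talk, while the credential ID, variable names, and image are invented for illustration:

```yaml
# secrets in a step -- sketch; "source" is a Jenkins credential ID, "target" the exposed variable names
steps:
  - name: publish-artifact
    image: registry.example.com/poet/maven-jdk8
    secrets:
      - source: team-artifactory-creds     # hypothetical folder-level credential ID
        target:
          username: ART_USER               # surfaced inside the container as env vars
          password: ART_PASS
    commands:
      - mvn deploy -Dartifactory.user=${ART_USER} -Dartifactory.password=${ART_PASS}
```

Because the mapping happens in the definition file, the generic step container never needs to know which team's credentials it is running with.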
Now, we have certain control options as well, one of which is timeoutInMinutes. This is one of my favorites: as an administrator taking care of the whole infrastructure, I need to make sure resource utilization stays good, and if a job starts running and gets hung, we sometimes don't know how long it will stay hung — we either get a report back or have to go kill the job. So in this case we've put a limit of 30 minutes —

I think I'm hearing an echo, so hold on, guys. — None of the rest of us are hearing it, Ravi; you're sounding great. Although, if you'd like, I'd be delighted to interrupt you again with some more questions — are you at a point where you'd be okay with being interrupted? Yeah, yeah, of course, that's good.

Okay, so one of the questions was related to parallelization. I think you had indicated this isn't parallelizable, but then Larry answered online that, hey, you can parallelize inside a container — again, the fundamental concept here is a container, which for me is a really elegant concept you're using. Right — I'd like to answer that. There are two kinds of parallelism we talk about. One: can we group multiple steps to run in parallel, as a single group? The second is within a particular step container: within a step, you can run something like a shell script, or a tool that does parallel execution of commands. So parallel command execution and parallel step execution are two different things. Parallel step execution — I have a slide on that too — is a functionality gap right now: you cannot execute parallel steps, but yes, you can do parallel command execution within your step. Thank you, thanks very much.

Next question. Okay — we had a question: are you using Azure or Oracle Cloud Infrastructure? What's your chosen infrastructure hosting — maybe not so much the vendor as the technique? It seems like it's all Kubernetes-based; could you expand a little on that? Yes — it's Kubernetes-based for us, but the way we've designed the whole automation for provisioning the pipeline engine — the Jenkins — you could actually do it on your laptop too: put Kubernetes on your laptop and install it there. So it's Kubernetes, but you can spin up Kubernetes on Azure, as well as on AWS using the ECS plugin.

Right — just a little more there for folks. Internally, other teams here are responsible for platforms, and Pivotal came out with their own Kubernetes platform. We went through about three Kubernetes platforms — Heptio and another one — and where they settled was the Pivotal Kubernetes; it's just Kubernetes with Pivotal putting some of their own UI on it. But everything we do is from the perspective of using Helm charts to get the deployments going — so if you were to use AWS Kubernetes, absolutely, or Azure, it really should be fairly straightforward, because we're using Helm and charts to do those deployments.

I find that especially interesting: that means you've been through multiple Kubernetes providers while using these same concepts, so you've confirmed by hard experience that the concepts are portable across Kubernetes providers. That's really excellent. Yeah — we were doing additional container work before that, and then last year, going into the beginning of this year, they settled on the Pivotal platform. But yes, it's been good to see, and a Docker container is generally a Docker container: all the rules people need to follow for good container design and security still apply — you need to be mindful of those things — but it absolutely works that way. And as in the diagram Ravi had up earlier, where we showed AWS and it said "VM," quote-unquote: there have been a few times where we needed an agent to run something that was just more efficient outside a container, on a VM — maybe a lot of files that needed to persist, or libraries that took too long to download — so we've done some things like that, and people have the ability to spin up their own other types of implementations as well. Thank you, thanks very much, Martin. Okay, so — yeah, please go ahead, Mark, do you have any more questions? No, let's pause on questions and let you resume, Ravi. Excuse the disruption, and thank you for your patience with my questions. Thanks very much.
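Since Martin notes that all of their deployments go through Helm charts, a deploy step in this model could be as simple as a step container carrying the Helm CLI — a sketch only, with made-up chart, release, and namespace names:

```yaml
# helm-based deploy step -- illustrative; chart, release, namespace, and image are assumptions
steps:
  - name: deploy-to-k8s
    image: alpine/helm                  # any container with the helm CLI
    commands:
      - >-
        helm upgrade --install inventory-service ./chart
        --namespace team-apps
        --set image.tag=${PIPELINE_APP_VERSION}
```

Because the pipeline engine only knows "run this container with these commands," swapping Kubernetes providers underneath never touches the pipeline definition — which is the portability the hosts are discussing.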
So — we were talking about a few of the other features: the control options within the pipeline, starting with timeoutInMinutes. By default, any job that executes through this pipeline gets aborted if it doesn't complete within 30 minutes. But if there are steps you expect to take more than 30 minutes — in this example we've specified 120 — then that particular step won't be aborted for two hours; after two hours, if it still isn't complete, it gets aborted. So you can specify fewer minutes, or more than 30, per step. This is very efficient because it helps with resource utilization: if my agents are tied up with the master for a long time on a stuck job, I can't provide those resources to other jobs. Once we implemented this feature, we could abort stuck jobs, free up the resources, and allocate them elsewhere.

Then we have continueOnError. When you design your pipeline, you know best: if there's a step that keeps failing but you'd like to proceed to the next steps anyway, continueOnError helps for that particular step — mark it true, and even on failure, execution proceeds to the next steps. That's how continueOnError is used.

Now let me go over the standard environment variables. There's a list of standard environment variables you don't need to declare — you can use them directly in your pipeline steps. And if you'd like more, there are hundreds of them in the pipeline state JSON that you can use directly in your pipeline; you can also override the variables as you like as you go.

We also have a few pipeline control options. The reason for these is that they work at the Jenkinsfile level. Earlier I said you don't need to update the Jenkinsfile over and over — it's a one-time activity when you place it in your source code — but there are certain operations you might want to do there. For example, the log level: today, when people write a pipeline library in Groovy, they utilize the Jenkinsfile heavily and add features and capabilities there; in the same way, if you need to change the log level, you can do that directly in the Jenkinsfile. And by default the pipeline uses a file called pipeline.yaml as its definition file — but suppose you'd like to use a YAML file with a different name in your source code. Then you come back to the Jenkinsfile and specify it there, and the pipeline will pick up that definition file, start executing, and begin parsing it.
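A sketch combining the two per-step control options just described — the nesting under a control key follows the step components Ravi listed earlier (environment, conditions, secrets, control), though the exact key names are inferred from the talk:

```yaml
# per-step control options -- sketch; key names inferred from the talk
steps:
  - name: integration-tests
    image: registry.example.com/poet/maven-jdk8   # hypothetical image
    commands:
      - mvn verify -P integration
    control:
      timeoutInMinutes: 120     # allow up to two hours instead of the 30-minute default
      continueOnError: true     # a failure here will not stop the remaining steps
```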
Now the last feature, which is very interesting and important: templates. I already explained why templates are useful — they're reusable for us. There are two different types of templates you can create: you can keep them locally, in the repository where your source code lives, or you can keep them in a remote repository altogether and include them from there. In this case, you can see the steps we were writing in the YAML file — I'm converting those steps into templates. I have a templates folder with a slack.yaml: this particular step is for Slack; I took it out of the main pipeline.yaml and put it into a separate YAML file, slack.yaml. That becomes my template, and I keep it in the same repository. Then, when I write my pipeline.yaml — the main definition file in my code — I write the pipeline, I write whatever steps I like, and along with them I can use the include option (with a hyphen, as a list entry) to pull in the templates I've written in the same repository. That's how you include templates in your main YAML file. And there are different ways of doing this: you can even create a template for your global configuration, or a microservice environment-level configuration, and put that in a template too. The shortest pipeline.yaml you'll see is just "pipeline" and an include of a template — that's the shortest one. So we recommend people use templates. Template-within-template can go deeper, but we don't recommend going down that route. The reasoning: if you have a number of microservices and don't want to change pipeline.yaml over and over as you add or update steps, functionality, or features, you take the whole thing out into a template and include that main template in pipeline.yaml — any further changes happen in the template, not in the main pipeline.yaml definition file.

Now let's talk about remote repositories — if I have a remote repository, how can I include from it? Say I have a remote repository where I create a template; you can include multiple steps within the same template as well, so it depends on how you want to utilize these templates — there's no hard and fast rule here. When you create templates outside of the code repository, you have to use something called resources, and there you add the repositories where your templates reside. Again, it's a hyphen — an array — so you can include as many template repositories in the section as you want; if your templates live in five different locations, you can list all of them here. You have to provide the template name, the URL of the repository where the templates reside, the label — which is just the branch they live on — and a credential ID so the pipeline has access to those templates. Then you write your pipeline steps and include the templates by name, like this.
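Pulling that together, the local include and the remote resources setup might look like the following sketch — keys like include, resources, and label follow Ravi's description, while the names, URLs, and credential ID are invented:

```yaml
# templates/slack.yaml -- a step extracted into a local template
steps:
  - name: slack-notify
    image: registry.example.com/poet/slack-notify
    commands:
      - notify --channel "#team-builds"

# pipeline.yaml -- including local and remote templates
resources:
  - name: shared-templates                 # a remote template repository
    url: https://bitbucket.example.com/scm/poet/templates.git
    label: master                          # the branch the templates live on
    credentialId: bitbucket-read           # hypothetical credential ID
steps:
  - include: templates/slack.yaml          # local template, same repository
  - include: shared-templates/deploy.yaml  # template from the remote repository
```

With this layout, updating the Slack step for a hundred microservices means one commit in the template repository, and — since pipeline.yaml never changed — no spurious rebuilds of the application code.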
And that's how you can include templates from remote repositories. There are more details you can go through — let me know if you have any questions. I'll switch back to the presentation now — I need to put this in play mode — and hand over to Larry to talk more about the internals of the pipeline. Thanks, Ravi. Thank you.

Yeah, so I wanted to cover, at a high level, the engine design — what we thought about and how things are organized. As you can probably tell from the examples Ravi went through, it's heavily inspired by other modern container-based pipelines: Drone, Azure Pipelines, Cloud Build, and the Bitbucket Pipelines. We kind of stole some of the best ideas and merged them with our own thinking around pipelines and the features we provide. I think that's also nice because if you have developers familiar with some of those other pipelines, they should feel pretty much at home — they're all very similar. In fact, we even thought about writing converters to and from the different formats, for people who wanted to play around with different pipelines.

Some design goals for the code itself. We wanted things to be really easy to test — making things testable was a big goal — so we ended up isolating a lot of the Jenkins functionality and, as Ravi mentioned, limiting the number of plugins we used, just to keep the core code really testable and quick to test. We also wanted to structure things so we didn't have to change the core code very often: the main extension point really is containers — adding new containers. We lean on this very heavily, and it means we're not the bottleneck: other groups can add whatever functionality they want. In fact, you can use third-party containers — any container, totally fine. The execution engine is totally generic, and this was another big goal, something we had to push ourselves on a lot: the pipeline engine doesn't know anything about the containers. Like I said, you can use off-the-shelf containers — we use the Gradle container directly from Docker Hub, totally fine. It took a little effort — it would have been easier if the pipeline engine knew certain things about the containers, or the containers about the pipeline engine — but we really tried to keep it generic, and it paid off in the end. The pipeline handles all the step execution, error handling, and things like that, but has no knowledge of the internals of the containers at all.

The containers do have access to the pipeline state information — that's something we expose, as Ravi mentioned. There are a lot of built-in environment variables, all listed in the wiki, and besides the kind of simple information you'd see in a normal Jenkins job, one of the variables points to a file that holds the total state of the entire pipeline. So if you want to implement a container that does things like update a dashboard or send a Slack or email notification, the full pipeline state is available in a JSON file — each step, its status, what the steps were, everything like that.
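As an illustration of that idea — with hypothetical variable and JSON field names, since the real ones live in the team's wiki — a custom notification container could read the state file with something like jq:

```yaml
# a notify step reading the pipeline state file -- variable and JSON field names are hypothetical
steps:
  - name: custom-notify
    image: registry.example.com/poet/notify   # assumed image with curl and jq installed
    commands:
      # PIPELINE_STATE_FILE stands in for the real variable that points to the state JSON
      - FAILED=$(jq '[.steps[] | select(.status == "FAILED")] | length' "${PIPELINE_STATE_FILE}")
      - curl -s -X POST -d "{\"text\":\"${FAILED} step(s) failed\"}" "${WEBHOOK_URL}"
```

This is what keeps the engine generic: the engine writes state out, and any container — yours, a teammate's, or an off-the-shelf one — decides what to do with it.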
Also, the steps share the same workspace, and that turned out to be a really important way to share information from step to step. If you think about it, that's pretty normal for a pipeline: you're building code and then running tests, and obviously it's helpful to have access to the same workspace, so writing files into the shared workspace is another way steps can pass information to each other.

Okay, so the implementation. It's implemented as a Jenkins shared library, so if you've used one of those before, it's basically doing the same thing; as Ravi mentioned, it's Groovy, the Groovy CPS code. It's a very small core: like I said, we wanted to push functionality into the containers as much as possible, including things you would normally think of as built in. We had a lot of internal debates about how to implement things like notifications and reporting; it seems like those should be built into the pipeline, but again we pushed ourselves to make those just normal containers like everything else, and that's how they work. We limited the use of plugins to keep things small and testable; like I mentioned before, testable code is a priority. Currently it's at 117 unit tests, and it's always good to have more; we also have 58 test pipeline files in there right now, so that as we make changes we can move quickly and add new functionality without breaking any existing customers. Backward compatibility was again a big, big priority. Also, the pipeline builds itself: if you look in the repository, there's a pipeline file that we use that basically just runs the tests and does some Slack notifications, I think. Making sure we had a lot of exposure to the pipeline while building the pipeline, and were able to test out features right away, was important. I also wanted to cover how you extend the pipeline functionality. As I mentioned, the main extension point is adding new containers, so obviously things like build, test, deployment, and notification are all there, but like I said, even reporting and notification are done that way, and that ended up being really useful. Even for Slack notifications we ended up with three or four different ways to do it, because some teams just want a really quick, simple Slack message, hey, something didn't work, go to this link and see what happened, while some teams wanted a Slack message updated as each step completes, so you get step-by-step progress in real time. We even added a kind of Slack bot to handle some of this functionality. By keeping things separated this way, it ended up being really flexible; we didn't have to modify the core pipeline code because one team wanted different notifications. It's all totally user extensible.
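A quick illustrative sketch of the shared-workspace point above, in the same assumed syntax as the earlier examples: a build step leaves an artifact that the next step picks up from the same directory.

    - step:
        name: build
        image: gradle
        commands:
          - gradle build                  # leaves build/libs/app.jar in the shared workspace
    - step:
        name: test
        image: eclipse-temurin
        commands:
          - java -jar build/libs/app.jar  # same workspace, so the jar is already there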
Larry, do you mind if I jump in? I have a link here: this is the shared library we use at T-Mobile, and it has 40-plus different generic containers, one for each of the tools and technologies we're using at T-Mobile, all listed here. And the best thing we do here is, if you click on any of the containers, or if you search, we provide the step image, what versions it has, and the recommended usage for it, so teams don't need to search around or type anything; they can just copy-paste the whole thing into the step in their definition file, fill in the details, and start using it. You can search for any of the containers by typing, and you'll get all the build types and all the information. So yeah, circling back to you, Larry.

Yeah, thanks, that's a good point. A lot of those containers were built by our team, but a lot of them were also built by other teams and contributed into this bigger pool, which I think was really awesome. And that website is another shout-out to Drone: we kind of stole this idea from them. They have a similar plugin index for their plugins, which work the same way, and since they open-sourced the design of that website, we used the same thing. In fact, like I mentioned, there's nothing special about these steps, they're just containers, so you can use some of the Drone plugins in our pipeline too; I think in our wiki we have an example of using the Drone Slack plugin for Slack just as-is, there's nothing special about it. Oh, and I just wanted to mention, back on the previous slide, that the templates Ravi talked about are another really good way to share functionality. They really allow you to abstract out complex logic and share it. For these containers that we built, a lot of them have the same kind of build sequence: you're building some source code, then building the Docker container, then testing the container, utilizing something like the Google container tests to do a lot of the testing, and then publishing the container. So for all of these we have a template, something like build standard container, and that way everyone working on these containers can build them in the same way, in a repeatable way. It almost gives you a mini API into more complex logic: you might have a template like send notification that internally calls a Slack bot, then maybe sends an email and does something else, and you expose a very simple interface for developers to reuse. It can even have its own inputs, via environment variables the team provides when they include the template, and that ended up being really powerful.
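For illustration, a hedged sketch of that mini-API idea, with hypothetical names and keys throughout: a notification template taking its inputs from environment variables, and a pipeline including it.

    # templates/notify.yaml
    - step:
        name: notify
        image: registry.example.com/poet/notify
        commands:
          - send-notification --channel "$NOTIFY_CHANNEL" --message "$NOTIFY_MESSAGE"

    # pipeline.yaml in the consuming repository
    - include:
        template: templates/notify.yaml
        environment:
          NOTIFY_CHANNEL: "#team-builds"
          NOTIFY_MESSAGE: "deploy finished"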
Okay, so another extensibility point is the idea Ravi touched on earlier, pre and post steps. As we were building this out, again we wanted to keep the core really small, but there was some functionality we wanted to run every time. Originally, for example, the InfluxDB reporting we ended up building is implemented as a container, but we had to add it to every single pipeline, and that got to be a little burdensome, even though, like I said, you can simplify it with templates. What we ended up doing is, at the Jenkins administrative level, you can define a kind of global template repository, and we'll automatically load some pre and post steps from it. These are kind of mini pipelines, and there's nothing special about them: basically we add a pre pipeline, then the user pipeline, then a post pipeline. So you can define ahead of time, for example, let's say you didn't want the pipeline to run unless all the images in it had been scanned by an internal scanner: you can implement that as a pre step and fail the pipeline before you ever get to the user code. And on the post side, if you're going to do special reporting like we did for InfluxDB or something like that, you can do that too. These are all, again, implemented as pipelines, so all the stuff Ravi talked about before applies: it's a normal pipeline, all container-based, a list of steps, you can use environment variables and conditionals, it's all the same. That was another thing: we wanted to keep the syntax small, so there aren't a lot of concepts, but we reuse them in different places, which makes it very flexible. You can do a lot with it, but there aren't many concepts you really need to know to do any of this.
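For illustration, a hedged sketch of such a pre step, with hypothetical image and flag names: a globally loaded template that fails the run before any user steps execute if an image has not been scanned.

    # loaded automatically from the global template repository as a pre step
    - step:
        name: enforce-image-scan
        image: registry.example.com/security/scan-gate
        commands:
          # exits non-zero if any image referenced by the pipeline is unscanned,
          # failing the run before the user's own steps ever start
          - scan-gate --fail-on-unscanned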
So again, if there's something you want to add, everything is open source, and like I said we have a decent test suite already, so you can feel confident submitting a PR; as long as it passes the tests, and you add new tests for the new functionality, that'd be great. I know a lot of people have been asking about parallel builds; that's something we don't have yet, and we've talked about it, along with some other things. We'd love to see contributions, and we're happy to discuss approaches: if you want to open a PR or an issue and you're not sure where to start, we can point you in the right direction for sure.

Larry, that's a great thing for me to hear: this is available open source, and the team at T-Mobile is continuing and willing to listen to pull requests, evaluate them, and have discussions about them. That's really amazing. Any concerns you have, or any directions where you'd say, please don't go this way or that way?

I'll let Martin and Ravi add to that, but I think one thing is, if it's something like the parallel steps, something that would be a big change, it would probably be helpful to have a discussion about it first. Something like parallel steps isn't there already because, while it seems like an easy thing to say, let me run these steps in parallel, it's actually very complex if you think about it. What do you really mean by that? Do you mean you want to run multiple steps on the same agent, or do you want different agents in the Kubernetes cluster running steps in parallel, in which case the workspace is no longer shared? How do you merge the workspace, and how do you branch off and fork? It's obviously very complex, so for something like that it would probably be helpful to start a discussion about your approach before you invest a lot of time in it, because there are a lot of trade-offs. That's something to keep in mind, but Ravi and Martin can probably answer better.

Yeah, something else I'll mention is to definitely familiarize yourself with the pipeline concepts and run it a bit. One of the things we did run into is that some teams would say, hey, we need XYZ plugin added, where really the answer was no, we'll do that with a step container, and you implement the step container. It's sometimes hard to get out of that habit of, oh, I'll get a plugin, I'll add some more Groovy code into my pipeline to make something happen. Getting used to that took a little while for some folks, but once they got it they were like, oh, rock and roll, and they were fine. So that's another thing I'd suggest: definitely familiarize yourself with the power of the templates and some of the conditions. I saw some great questions being asked about what you do with errors, and Ravi touched on that a little; we do have these when conditions and things. So it's definitely a good place to start getting used to it, familiarize yourself, then go under the covers a little, take a look at the code Larry developed, and then, absolutely, like Larry said, post an issue or other questions and we can talk through some of those things to help folks with those types of inquiries.

Another thing to keep in mind, and if you're familiar with Jenkins shared libraries you probably know this, but it's worth mentioning: it's Groovy code, but it's not really Groovy code, because it's executed by Jenkins using this continuation-passing style. So if you look at the code, you might notice some funny things, like why are we using these simplified for loops, with for(i...) instead of a more modern loop: it's because the modern form isn't supported by the Jenkins CPS Groovy. That's something to keep in mind.

Any more questions, Mark? Yeah, I've got several, if you're okay going into some further questions. Just let me finish this slide, and then I think we can take questions and answers for the rest of the time. Thank you. So I just want to touch on the pipeline's limitations in functionality. One of them we already talked about, the parallel step execution. Also, right now it doesn't support Windows and iOS builds, though there is a workaround we have done for iOS: it's all about containers, so from within the container, if you SSH out and can reach the build server, you can deploy or take care of the iOS builds that way. For Windows, right now, you have to do a hard-wired connection with Jenkins and then use a freestyle job, or the other job types the way you write them currently. And just to summarize a little: the value we have added at T-Mobile using this pipeline is not just one component; there are different components that have really helped us, the infrastructure we have built, the pipeline library and framework we have built, and the kind of support model we have within the organization, which has allowed us to save a lot of money and add value for the company, along with the customer-focused, feedback-driven approach. These components have helped us make sure we add value to the existing systems using this pipeline library. With that, I conclude from my side, and yes, we can take questions and answers. Here are our contacts if you'd like to touch base with me, Martin, or Larry. Thanks, Mark.
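A hedged sketch of the iOS workaround Ravi described, with hypothetical host and image names: the step container simply opens an SSH session to a Mac build machine and runs the build there.

    - step:
        name: ios-build
        image: example/ssh-client   # hypothetical: any image with an ssh client works
        commands:
          # the container itself can't build iOS, so it reaches out to a Mac host
          - ssh builder@mac-build-host.example.com "cd app && xcodebuild -scheme App build"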
Thanks very much, Ravi. So I've got a couple of questions that have come to mind, and I'll indulge my personal bias first, so forgive me for putting myself at the top of the queue. You mentioned customer support as a challenge. Were there techniques you found helpful as you were helping your customers adopt this, and others that you thought would be helpful but ultimately were not, in getting adoption?

Yeah, that's a really good question, because I particularly mentioned that we were customer focused. For all the features you see in this pipeline, while we were having the design discussions, Martin, Larry, and a lot of other folks from our team would talk about, okay, we are designing this particular feature, and we keep the customer in focus: is this a simple approach for the customers, the people, the developers who will be using it? If it's not simple, we have to go with an alternative approach. So the whole design was completely focused on the customers' use. Second, this pipeline was so simple that it was easy for me to get teams trained. We are a centralized team, and I don't think I mentioned it, but we are a team of seven people. We maintain the infrastructure and we maintain the pipeline too, but we are not extending the pipeline library over and over again; we don't need to keep writing Groovy code, because it's always extensible using the Docker containers. The approach we took is that we trained the development teams: if I remember correctly, I have given less than four hours of training to each of the teams at T-Mobile, to get them trained, let them do a POC by themselves, do hands-on work, and then start working on the pipeline. The benefit we got from that is collaboration, in the sense that the development teams started creating their own containers and actually contributing to the library itself. And there is a process we follow: when anybody wants to contribute to the step container library, we have a review process, our team reviews it, and then we merge the code into the step container library source code, and that's how it becomes available for everybody else. We wanted to make sure the step containers are generic, so that everybody can use them, or extend them by using them as a base image as well. So training is one of the things, and regular feedback is another: on a quarterly basis we reach out to all the teams and get their feedback on whether the pipeline is fulfilling their requirements. And as Martin mentioned, our customer satisfaction score, the CSAT, was really great, approximately 4.9 out of 5, because the teams were well trained, the value added was great for them, and teams were able to execute tasks by themselves and onboard their microservices easily onto the pipeline. That's the approach we followed.
Yeah, I think, just to add a little bit more there: as I noted earlier, one of the things our team was responsible for, and still is, is helping educate teams and get them onto some of the newer technologies. We're a large organization working very hard on digital transformation, trying to turn that ship in a different direction, and part of that, for example, was containers and step containers. We ran into a variety of teams who had not really engaged with or used containers before, so this gave us the opportunity to educate teams on using containers. Some teams were already there, awesome, we could roll with it; others we just needed to educate, and this was important for us, because we didn't do it only for the pipeline, we did it for the organization. We really wanted people moving to container development and away from all the VMs; we had large legacy systems and platforms that we've been migrating away from, so these were all additional benefits we got. And then things like our documentation continued to get tuned, along with how we trained. Every time we go talk to the teams, we get that feedback, and, oh, okay, something wasn't done well. So when we talk about the technology, we really had to be open to what the customers were telling us. And if they tell you no, well, I always say this is a sales thing, a key thing about sales is that when somebody tells you no and you're trying to sell them something, and here we were trying to sell them on our pipeline, we weren't mandated or anything like that, then you need to know why they're saying no, what's behind it. Sometimes it's as simple as they already had a pipeline and, as they view it, we're taking away their job; other times it might be, you're just going to create more work for me, or they just didn't understand. So we spent the time to talk to them, and that also shaped how we continued to talk with teams, got better at it, and improved what we did for our documentation and so on. We got some really good compliments about the documentation, unsolicited feedback from teams about how much they appreciated it, and I don't know about you, but in my experience software engineers, of which I'm one, tend to be a very cynical bunch; when you start getting unsolicited positive feedback, that was huge, and it told us we were on the right track.

Thanks very much. So we have a question related to your Kubernetes environment. Can you give us some hints about its relative size, how it scales out for you, and how you watch it, those kinds of things? Is that part of your team, or is there some other organization that shepherds the cluster?

So, this is partly our team: the Kubernetes clusters themselves are managed by the IT team, they host the clusters, and we are users of the clusters. We have two or three namespaces that are ours, and the allocation is based on the users. I can't give you the exact data right now, whether we have 500 GB or what the CPU limits are; we did some calculations on how many pipeline engines we can host in a particular namespace, but that calculation is not on top of my head right now. The one thing I can tell you is that the Jenkins home workspace, which is what needs storage, was less than 20 GB for each pipeline engine, because we don't store any artifacts in the Jenkins home: every artifact that gets built goes into Artifactory or the Docker registry. For logging, we don't keep a lot of log rotations stored in the Jenkins home either; the logs actually get stored in Splunk. So these are the kinds of small techniques we have used to make good use of the resources, and as a result we don't use a lot of storage. As far as I remember, right now we have 14, or a maximum of 15, pipeline engines, Jenkins instances, running in one namespace.
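For scale, a hedged illustration of that storage figure in Kubernetes terms, with hypothetical names: each engine's Jenkins home can stay on a small volume because artifacts go to Artifactory or the Docker registry and logs go to Splunk.

    apiVersion: v1
    kind: PersistentVolumeClaim
    metadata:
      name: jenkins-home-team-a   # one small volume per pipeline engine
    spec:
      accessModes: ["ReadWriteOnce"]
      resources:
        requests:
          storage: 20Gi           # stays small: no artifacts or log archives kept here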
Yeah, I'll just add a little bit more to what Ravi was saying. We'd have to go look at the CPU allocations, but as noted, we spin up those individual pipelines for teams, and if we start pressing up against how many CPUs we're allocating, the good thing is that the step containers, and the agents that initiate those step containers, are all dynamic: they come and they go, you just need them on demand. So the advantage for us has been that we don't consume huge amounts; the main masters have to stay there running. It's been interesting to see some of our solutions kind of idling away; if you were doing full CI/CD with something being built every few seconds, well, we're not at that level, so it hasn't been egregious. If somebody is really curious, we can find out a bit more, but we just work with our platform team: if they feel we're pushing up against limits, we have monitoring tools they provide us, and we can always ask for more. They've been great to work with; we've been engaged with them closely since they came on, and they've been very supportive in helping us.

So how do you test-drive changes to POET before they go into production? Do you just rely on the unit tests in the POET framework? What's your experience in rolling out new capabilities into POET? Larry, would you like to answer this?

Sure, yeah. So we do have a lot of unit tests. We also have, and this is not part of this project, it's a separate project, some integration tests that go through some pipelines and do full builds and deploys, make sure that POET succeeded and that the metrics end up in the database, that everything worked out. So internally we do have a bigger integration test suite that we also run. And then we really try to be backward compatible; we haven't introduced many breaking changes so far, just because once people have something in place it's harder to change. The other thing is, as Martin and Ravi mentioned, we focus on customer support and customer service, so if something does break, we're usually in all the Slack channels the teams are in and can respond and take a look quickly. But yeah, the unit tests have been helpful, the integration tests have been really helpful, and the focus on not introducing breaking changes has been a big one. Thank you.
Now, how are you handling Docker container component security concerns? Say somebody chose an outdated Docker image, they decided to build on a Java version that is now six years old or something. Do you have systematic things, or is that left to teams to decide how they secure their own code?

Yeah, that's a great question. Generally, right now there are T-Mobile guidelines, and they are ever evolving; it's something our security organization has been working hard to update. There are requirements for certain scans they want run, and there are actually step containers we have where logging and information is captured that says, hey, we ran these things, here's all the data, so if somebody is doing those scans, that information is also being captured. That's part of the T-Mobile mandates. But generally it's not our job, as we see it, to be the enforcer for that, though those pre and post steps can be a spot where such checks are injected, and certainly there are some places where, if we really need to know what the pipelines are doing, we can capture some of that behind the scenes. So we use the T-Mobile guidelines; we generally work from a perspective of trust, but maybe verify, and that's really what the security teams' audits and things like that are for. If we saw something egregious while helping somebody, we might say, hey, look, this isn't good. Our own pipeline and our own containers, certainly the ones in the library we showed, go through scans and have to adhere to the appropriate requirements as well. So that's how it works: we really leave it to the teams to do the things they're supposed to do within T-Mobile guidelines.

Thank you, thanks very much. So, there was some allusion to this earlier: are there any substantial problems you found by choosing containers as the key element of your pipeline? You appear to have built something more powerful, to my mind, than simply having Jenkins execute some code in the pipeline; you're really using containers as the steps. Were there things where you said, oh, this thing we can't do that way, and if so, any insights you gained from that? Larry, would you like to answer, or should I go ahead?

I'm trying to think if there's anything... yeah, go ahead. I was going to say the one I could jump to, which we've touched on a couple of times: Ravi mentioned the iOS and Windows builds, and the whole thing about the AWS ECS capability. Those were really where we did notice certain things, and Larry can jump in and correct me if I miss something here. Some of the time, if you were doing a Maven build and you had a lot of libraries coming in, you've got this big Java build with all these libraries that need to come in, and if you didn't have a lot of that preloaded, then depending on the network you could run into some challenges, with all the dependencies being re-fetched every time. So there were some situations where we did some work and said, okay, this might be where we need a static VM, or we use ECS, that was the AWS capability. For those who don't know, ECS is just a way to create a server and bring up what you need on it dynamically, which is awesome. We don't have quite that level of automation in-house, we have tools that do somewhat similar things, but look, Amazon and Microsoft have put a lot of money into building those kinds of automations, so we
looked at leveraging that. So where somebody needed a little more of that, having that cache available to them, we could spin that up, it would load, and they could do what they needed. That's at least one case I can think of where the containers get you a long way, but once in a while we had a few challenges, and usually we could also work with teams to revamp how they build their containers, setting up some of the dependencies ahead of time in the container image.

Yeah, that's a great example, Martin, and I believe we struggled almost a month to solve this problem, because of how the source code was used and then these libraries Martin talks about, the Maven libraries, and how to share them across builds. So we actually went with AWS and ECS, and we can spin that up from the pipeline engine too, so we utilized that VM to spin up the agent, and that really helped us.

Thank you. So one of the questions that just came in was from someone who arrived late and didn't see anything about how you do checkouts and other artifact management. Do you have sample Dockerfiles that you're sharing online that represent the steps of what you're doing?

So, how we do checkouts: it's a Jenkins pipeline library, so the checkouts are done through the Groovy side, I think the Git plugin, Larry, correct me if I'm wrong. And the Dockerfiles you're talking about, for the step containers, are internal to T-Mobile, and they have certain T-Mobile-specific information in them, so it's hard to share them, but we'll check internally how we can make them more generic for outside people and share those containers as well. If you go into the documentation, you can see the different step containers as recipes, how we're doing the Docker build basically, and you can use those examples if you'd like, but the specific Docker containers we haven't shared.

And that's something we've actually been talking about over the last few weeks: when we look at what we put in open source, it felt like there were a few more instructions we could provide. Also, as Larry noted, we borrowed the Drone code for the container index; frankly, why build a whole new UI for managing our containers? We did add a little nuance, which is that we could have, say, a project with a bunch of repos in it: people could ask for a repo where they want to put their step container, submit to us that they have one ready to go that they'd like added, and we kept a kind of master list in another repo that let us build that documentation, with the links and all the information for those containers. So we've talked about at least getting some of ours up there as examples. But as Larry noted, if you have just a very basic container, a hello world container, try some of the parameter passing; the examples do show how to do all of that, and it can work for you quite well.
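A hedged sketch of that suggestion, in the same assumed syntax as earlier: a first pipeline using a plain public container and one parameter passed in.

    pipeline:
      steps:
        - step:
            name: hello
            image: alpine                  # any off-the-shelf public container works
            environment:
              GREETING: "hello from poet"  # a parameter passed in as an env variable
            commands:
              - echo "$GREETING"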
Yeah, and again, the public Maven container, the public Gradle container, those all work fine, any containers really. And this is a common approach; like I said before, I haven't tried the Google Cloud Build containers, but I'm sure those would work, or if you're using Amazon CodeBuild, you know, it's all the same approach, right: start the container, run commands in the container.

Thank you, thanks very much. So monitoring is one of the questions that was just raised. Are there things you've been doing for monitoring where you'd recommend, hey, others should consider this monitoring technique? And what have you learned from your monitoring systems? Tell us a little more about what you monitor and why.

So we have different tools we're using. One of them, from the pipeline perspective, is Grafana with InfluxDB, which is implemented as part of a container itself: pipeline high availability, availability of the pipeline, and different metrics about the pipeline, that's the Grafana side, and it's a step container. Apart from that, we use AppDynamics to check the traffic and the pipeline engines. And we have automation around it, because we have 28, close to 30, Jenkins instances running in the Kubernetes space, so we keep hitting the URLs, checking that we get a 200 OK, and we put the results into Slack messages. So when there is any problem, any disk space issue, or a URL not responding, we get a message on Slack and the teams take care of it.

Yeah, and our Kubernetes platform is tied into Splunk; we're big users of Splunk here, so we also have information and insights coming from there about the container runs. We're always talking about how we expand and do better, and you can always do better on your monitoring, but those are the components that come together to form how we monitor and watch. There's information about how often the pipelines run, how many step containers, things like that, which is the Grafana stuff we talked about; AppD gives us good performance information about how things are running, and for those not familiar, that's a third-party product our organization, well, T-Mobile, uses for monitoring application performance; and then we have the pure hardcore logging that goes into our Splunk systems.
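For illustration, a hedged sketch, hypothetical names again, of how that InfluxDB reporting could ride along as an ordinary post step, in line with the pre/post mechanism described earlier:

    # a post step loaded from the global template repository
    - step:
        name: influx-report
        image: registry.example.com/poet/influx-report
        environment:
          INFLUX_URL: https://influx.example.com
        commands:
          # reads the pipeline-state JSON file and writes run metrics to InfluxDB,
          # which the Grafana dashboards then chart
          - influx-report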
Thank you, thanks very much. I think we've settled most of the questions, but I've got at least one more, and you're welcome to say, I refuse to answer. The question is: have there been mistakes you've made that other people should learn from? Sometimes we make mistakes. Don't tell us anything that's super secret, but are there things where you'd say, you know what, we made this mistake and nobody else should ever make it, because we learned a bunch from that terrible thing?

Yeah, definitely, I will start, and then Martin and Larry can add to it. It's been about eight years that I've been working with Jenkins, and I have made a lot of mistakes. I made the mistake of using a lot of plugins in my Jenkins; my masters would go down, and I'd be scared about what would happen the next day and whether the teams would be able to use it. So over that period we learned that we should avoid using a lot of plugins, and also not to use a lot of Groovy code to extend your pipeline. That's how this pipeline came into the picture; it's part of all the learnings we've had so far, the birth of this beautiful pipeline and the architecture we came up with. Thank you.

Yeah, I'll just piggyback on what Ravi is saying a little. Look, all of this is the fruition of learning from our mistakes and where we could do things better, and I'm sure even in this there were some spots where we challenged ourselves. Often for us it was how we were engaging with the development teams, or something like that, so we're always checking; that idea of being honest about where your issues and biggest challenges were is why this came about. Ravi kind of touched on it: I got frustrated because we wanted to be container-based, but some of the developers on our team would by default just go back to writing and extending the Jenkins code, and it's like, no, no, no, this has all got to be abstracted. It's a slightly harder concept if you've been working in the pure Jenkins world, you've got to shift your mind around it, so it was like, we need a reset. And I actually came across Drone, learned about it, and went to Larry and said, hey, check this out, look at what this does, what if we go down this path? And it just rolled: we actually got the initial core of this running in two months, with people able to start using it, and then over another four, five, six months we continued to get the other enhancements in. So there's certainly, even in some of the code and how we did things, constant challenging of ourselves. I'd say even today our monitoring isn't quite as strong as we'd like it to be, we'd like to dial that up a bit more, but again, this was the culmination of a lot of our other attempts, and we just keep working on that idea of continuous improvement. Larry?

Yeah, I think Martin summarized it really well; the whole project is really about trying not to repeat the same mistakes. I guess I'd tack onto Ravi's point about plugins: even with the shared library, early on we ran into some problems where we were trying to get the library running on different Jenkins instances that had different plugins installed, and even plugins we weren't using would cause trouble. There would be Java classpath issues where one thing was using some version of some Java YAML library and another instance had a different version of it installed through different plugins, and those ended up causing issues. The other one is that complex logic should be testable, which is something we focused on a lot. Someone made the point that you can do all of this in a Jenkinsfile using Groovy, and that's true, but I don't know how many people have tests for their Jenkins pipelines, probably very few, because it's not very easy to test. So trying to make things as easy to test as possible, having that be a goal, is important. I guess another thing is to have things documented; that was again a big focus.
Whenever we had a new feature, we made sure we had documentation for it, and we didn't have things that were kind of hidden or secret; we tried to have everything documented. Thank you very much.

So, Ravi, Martin, and Larry, thank you for taking your time with us, we so appreciate it. Are there any concluding remarks you have? I think we've answered all the questions that were on my mind and the questions from our audience. Any specific things you'd like to say before we close our session?

I don't really have anything else myself, other than that we really appreciate the opportunity. We open sourced this because we did feel like, hey, let's see it out in the wild, let's see if it has value for others out there, and see what kind of feedback we get from the broader community as well. So again, I know people's time is very valuable, and we really appreciate everybody coming and listening in on this. Yeah, thanks everyone. Yeah, thanks, Mark. Thank you very much.

I'm going to go ahead and end this. The recording, after it's been processed, will be posted and available on the YouTube channel for Jenkins. Thank you everyone for joining a Jenkins online meetup. Thanks a bunch. Thank you.