Good afternoon, everyone. Hi. First up, I hope I'm audible. Right, welcome to the session. You'll see it's basically me; a shout-out to my co-presenter Arun, who isn't able to be here today. The agenda of the talk today is largely around how we at Hike Messenger (I hope you guys have heard of us) stumbled across the entire DevOps journey. And while we've reached a destination of sorts, for us it's still an ongoing journey. The larger focus of the talk is what it is that we really do, how we stumbled upon it, where we felt the need for it, how we went about solving the specific challenges that we came across, what the generic toolchains were, and how we ended up using them.

All right, a quick intro on who we are. We are a messaging app that's been growing in India. We are an Indian company with 100 million plus registered users, a presence on both iOS and Android, and offices in both Bangalore and New Delhi. Our primary competitors in the market, and this is exactly where the real meat of the talk comes in, are the usual big boys: the WhatsApps, the WeChats, the Snaps. As you can see, the kind of competition we have in the market is mostly the big boys. Either they are part of firms who have been in the game of scale for really, really long, much longer than we have been, or they bring their journeys and their experience in the same field to what they're doing in messaging. So we already have the scale; we have the primary problem statement of serving a lot of users. The quest for user delight becomes a key thing for us.

Messaging, as all of you probably know and would even agree, is a space where the user is a fickle entity. I mean, WhatsApp recently launched stories just like Snap did, and people didn't really take very well to it, yet on Snap it's a huge hit. Google has been trying its own strategies, and everybody's been trying to throw their hat into the ring. Apple has an entire platform that powers messaging. But to be very frank, it's very difficult to crack the holy grail and say: this is what makes the entire user base flock to us. It's a very social platform, it's an ultra-competitive market, and user traction is key. And while it's very easy to acquire users (all it takes is an app install), it's very difficult to retain them.

We also had to tackle the problem of a diverse tech stack. We are a firm that usually says: we don't restrict. We are not particularly a Java shop or a Python shop or an Objective-C shop. We dabble in almost everything when it comes to our server backends, and we don't even put much of a restriction on which build tools or utilities we use. We have a microservices-based architecture, with more than 100 deployable services in the back end that come together to present what the user sees in the app. Each of these services obviously has the mandate of moving at its own pace, and therein the problem statement deepens, because you need dependency management, you need orchestration, you need to solve all the problems that come with that.

The last bit is testability. The moment you start talking about that scale, and about the number of services that we have, testing it all and making sure the user experience isn't hampered or broken is in itself a big problem statement.
Now, looking for answers to all of this, we kind of stumbled on what DevOps would essentially mean for us. To be very frank, DevOps for us might be a very different statement altogether than it is for a lot of you. Our DevOps mission statement eventually was a very simple one: a scalable, test-driven vision with no tech bias that allows us to hit the market faster to delight our customers. If you look at this statement once again, you realize that most of the problem statements from the last slides are addressed here. You want a scalable system, which is obviously test-driven to ensure that we always deliver high quality. There needn't be a tech bias: whatever platforms, systems, tools, and methodologies we use need not be tied to a given tool, and we should be able to swap in and swap out any particular entity in the entire ecosystem at any given point in time. And we need to hit the market faster. If we have something that we think is really going to attract the user base, we need to get it to market much faster than our competitors. Or if our competitors are doing something and we want to follow in the same footsteps, we still have to move really, really fast. And delighting customers, as always (I'm pretty sure it's true for most of you in this room as well), is key.

So this is what a 10,000-foot view of DevOps looks like to us. For us, DevOps was essentially about finding answers, about solving problems, in three key spaces. The very first one was around change management, which obviously includes incident management. How do we take a particular change and shepherd it all the way to production, making sure it's not blocking other changes? And if it's not right, how do we take it out of the entire pipeline and let the others come in? A bunch of other things also come in here, which we are going to talk about: for example, how do we maintain our code? How do we maintain versioning? How do different entities in the system end up exchanging data and having handshakes with each other? That's also something we'd like to talk about later.

The second bit was test automation. I'm pretty sure I needn't speak a lot on this entire scope here, because there has been an entire track devoted to test automation in an agile world. Let's just say that our problems weren't very different from what you would have been hearing over the past couple of days.

But the third part is actually very, very interesting, which is configuration management. The reason we said configuration management is the third pillar in this entire exercise was that when people started testing this massive number of services, and each team wanted to test their own features, we simply said: we can't test everything on a single environment. You can't keep throwing everything at a single environment and say, hey, if it works here, it works; if it doesn't work here, it's broken. There are issues with reproducing a problem that you find; maintainability of that environment is a problem; it's a single point of failure; and so on. The need for multiple environments, the capability for people to provision their own test environments, and making sure that the wiring of those environments is pretty much as it is in our production environment, led us to configuration management being the third major point here.
Now, if you come down one level, this is a broad breakup of everything we really wanted to take a look at. If you look at change management: SCM best practices, like we said. How do we maintain our code? How do we do our branching? Our tagging? Our versioning? Everything in there was very instrumental to us. It's the first principle without which, no matter how much you think you can, it's very difficult to achieve a higher state. Code cleanup and refactoring: obviously, when we started, we were basically yet another startup. As it happens, people tend to move real fast and accrue tech debt over a period of time, and before you realize it, it's grown so large that other activities become really difficult. Software packaging and versioning, like I said, is again something very important to us. We understood that while it's very tempting to say "let's just do a git clone on a machine and run it," in a lot of respects that's not the best thing to do. Software packaging is also very key when you try doing environment and configuration management.

The next piece is build pipelines. How do you really propagate code from point A to B to C and eventually to production? Do you have machines doing it for you, or people doing it for you?

Config management, like we said, means config as code: you cannot have hard-coded entities or hard-coded endpoints in your code base. You have to have configuration as another deployable unit that you deploy along with your code. Every time you do a deployment, you take a binary package and you take a config package and you deploy them together. Configuration cleanup was a similar activity we had to do: just as with code cleanup and refactoring, configuration cleanup was the corresponding activity of removing all those hard-coded endpoints and making sure everything is configurable and contextual. When we were doing software packaging and versioning, we realized that config packaging and versioning goes hand in hand with it. How do you make sure a given deployment is just a code deployment, or a config deployment, or both? We wanted to be very sure about what a deployment means to us. At the end of the day, the puzzle was very clear: it's going to be either code, or config, or both (there's a small sketch of that idea after this part).

Build pipelines obviously had to utilize disposable test environments. When you have to run functional test cases, integration test cases, component-level test cases, mocking takes you only so far. As and when you start moving towards the tip of the pyramid, you start realizing that having disposable test environments, to offset the costs of maintainability, of ownership, and of the other overheads that come with it, becomes a key solution. I'm presuming most of the folks here are also involved in some amount of testing. How many of you are actually aware of the test pyramid that does the rounds? That's surprising. That's another philosophy we followed a lot: in terms of testing, we want to be bottom-heavy and get more specific as we go towards the top. When it comes to your unit or component-level test cases, you're going to have a very high test density. As you move up the pyramid, you make sure that while the scope of your test cases becomes wider, the number of test cases stays short and sweet.
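To make that "code, config, or both" idea concrete, here's a minimal sketch in Python. The repository URL, package layout, and paths are assumptions for illustration, not our actual setup:

```python
# Minimal sketch: a deployment is a pair of versioned units, a code
# package and a config package, fetched and unpacked together.
# The repository URL and layout below are illustrative assumptions.
import tarfile
import urllib.request

REPO = "https://artifacts.example.internal"  # hypothetical artifact repo

def fetch_unit(kind, service, version, dest_dir):
    """Download one versioned unit ('code' or 'config') and unpack it."""
    url = f"{REPO}/{kind}/{service}-{version}.tar.gz"
    local = f"/tmp/{service}-{kind}-{version}.tar.gz"
    urllib.request.urlretrieve(url, local)
    with tarfile.open(local) as tar:
        tar.extractall(dest_dir)

def deploy(service, code_version, config_version):
    # Either coordinate can change independently: a pure config
    # deployment just bumps config_version and redeploys.
    fetch_unit("code", service, code_version, f"/srv/{service}/code")
    fetch_unit("config", service, config_version, f"/srv/{service}/config")
```

The point of the sketch is simply that a deployment is fully described by two version coordinates, which is what lets you answer "was that a code change or a config change?" unambiguously.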
Being bottom-heavy on the pyramid actually allows us to save a lot of time when we are looking at UI failures, or app UI failures. The third part was, like we said, test automation: unit, component, functional, and integration, which is why I talked about the test pyramid. As you move from unit to integration, the test density has to slowly decrease, the test cases have to be more abstract, and what they verify has to be extremely precise. I heard somebody say in a session today that your test cases are supposed to serve a single purpose. If you try to do too many things in a single test case, you end up with more flaky test cases than definite ones.

So what did the toolchain look like, given all of these challenges? Some of these tools were pre-existing in the system when we started working on it. Some of them needed a slight tweak, a slight change. Some were introduced because we needed them. We primarily use Git. Our source code has always been on Git, obviously a private instance. This is where a lot of our branching mechanisms were forged. We had enough discussions initially, and we realized that the standard Git branching model is something we could really exploit, but we weren't using it to its full efficiency. As of today, we do our branching and tagging along with our versioning, but that still does not mean we do Git-based deploys wherever applicable; we ensure that all our deployments are via an artifact.

Jenkins is where most of our build pipelines and build jobs are hosted. It's a fairly standard tool. We did contemplate saying, hey, why don't we go with some other hosted tool that's usually available, host our pipelines on it, and run them there. The answer was fairly simple: we first wanted to solve all our problems at the level of first principles. If your first principles, like branching and versioning, are broken, moving to any of these tools cannot solve your problem. And this is one of our key takeaways from the entire exercise: as long as you don't fix your first principles, no tool can help you, because any tool you go to will eventually ask you to follow some kind of pattern.

SonarQube is something we used a lot for our code analysis. We did a lot of static analysis there, and a lot of our test results, both unit and integration, were essentially hosted there. We actually built a lot of tooling on top of SonarQube as well, which allowed us to get a lot of data out of it. It went into a lot of management reports, which we could use for release postmortems. And not just the SonarQube dashboards: we actually ended up using the SonarQube APIs to get a lot more data out.

Artifactory was our choice for storing artifacts. The reason we ended up moving to Artifactory was, like we said, that we have a diverse tech stack: we have Java, we have Python, and we dabble with a couple of fancier technologies we'll talk about as well. We realized we didn't want the overhead of maintaining separate repositories per technology. It's easy to have privately hosted repositories, but it's always better to have some kind of a Swiss Army knife.

Chef is something we still use for config management, especially for deployment to our production systems. It is, again, a tool that was pre-existing when we started this work.
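Since I keep mentioning the SonarQube APIs, here's a hedged sketch of the kind of pull we mean, against SonarQube's documented Web API for component measures. The server URL, project key, token, and metric list are illustrative assumptions:

```python
# Sketch: pull trend-worthy numbers out of SonarQube's Web API for use
# in release reports. Server URL, project key, token, and metric names
# are illustrative assumptions.
import requests

SONAR_URL = "https://sonar.example.internal"  # hypothetical private instance

def project_measures(project_key, metrics=("bugs", "coverage", "code_smells")):
    resp = requests.get(
        f"{SONAR_URL}/api/measures/component",
        params={"component": project_key, "metricKeys": ",".join(metrics)},
        auth=("YOUR_API_TOKEN", ""),  # SonarQube accepts a token as the username
        timeout=10,
    )
    resp.raise_for_status()
    component = resp.json()["component"]
    return {m["metric"]: m["value"] for m in component.get("measures", [])}

if __name__ == "__main__":
    # Numbers like these get aggregated into release postmortem reports.
    print(project_measures("hike:messaging-service"))
```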
As for Chef, we said we are going to let it be, and we are going to make sure that the handshake between Artifactory and Chef is as clean as possible, going with the philosophy that if tomorrow we feel the need to replace Chef with another tool, it should not be a problem for us at all.

Docker is something that we use a lot today. Around six months ago, we had only one or two teams trying to use Docker to get their own test environments. We realized there was a lot of potential there, and as of today we have systems and capabilities that allow anybody in the system, say a developer, to generate an entire ecosystem with just one click. You go ahead and say, "I want a single-click, production-equivalent environment," you click, and you have the entire environment. It is wired exactly the same way as your production systems; there is absolutely no ambiguity there. At the same point in time, the data management layers, the data processing, everything is simply a miniaturized clone of your production environment.

Coming to the testing bit, we had a pretty interesting problem to solve there as well. When we started as a firm, we had a lot of people who were into manual testing, and as time progressed, we realized we needed automated tests. When we started writing automated tests, the challenge was not just to have automated tests; the challenge was also to provide growth to the manual testers we had, for two reasons. One, they knew the system inside out; they had worked on the system so thoroughly. Two, we wanted to leverage that knowledge and have it converted into automated tests first, rather than drawing a line between the two groups. So we ended up writing a small Cucumber translation layer on top of standard Appium test cases, which allows our manual QAs to write Cucumber feature files; we then parse them, convert them into Appium test cases, and run them all together. This is something we actually use for our regular app-level automation (a small sketch of the idea follows below).

STF is something we started using because we maintain our own device lab, for reasons which are both old and new and probably org-specific. We have our own device lab that we maintain pretty well using STF, and the number of devices keeps growing.

I would actually be happier answering questions at any given point than waiting till the end of the presentation, so if you folks have questions, I'm more than happy to take them. So, our notion behind this is fairly simple. Along the entirety of a pipeline, you have certain milestones. For example, a piece of code moves from its own feature branch into the mainline; from the mainline, it moves into the master branch or a tag. And when it gets built, we want to cut off the possibility of going onto a production machine and doing a git clone with a deploy key, which has its own security risks. It also becomes very cumbersome, especially with languages like Java, because you have to compile all the time, and the compiler version is one of a bunch of things you then have to take care of. What we do here instead is that the Chef recipes and cookbooks we have actually pick up their artifacts from the Artifactory repos.
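Before we go on with the Artifactory handshake, here's the promised sketch of the Cucumber-to-Appium idea: a tiny step registry that maps Gherkin lines onto Appium driver calls. This is a toy illustration, not our production layer; the step patterns, locator strategy, and helpers are all assumptions:

```python
# Toy sketch of a Cucumber-to-Appium translation layer: Gherkin steps
# written by manual QAs are matched against a registry and executed as
# Appium driver calls. Patterns and locators are illustrative.
import re

STEP_REGISTRY = []

def step(pattern):
    """Register a step implementation under a Gherkin-style pattern."""
    def decorator(fn):
        STEP_REGISTRY.append((re.compile(pattern), fn))
        return fn
    return decorator

@step(r'When I tap on "(.+)"')
def tap(driver, element_name):
    driver.find_element("accessibility id", element_name).click()

@step(r'Then I should see "(.+)"')
def should_see(driver, text):
    assert driver.find_element("accessibility id", text).is_displayed()

def run_feature(driver, feature_path):
    """Parse a .feature file line by line and run every matching step."""
    with open(feature_path) as fh:
        for line in fh:
            line = line.strip()
            for pattern, fn in STEP_REGISTRY:
                match = pattern.fullmatch(line)
                if match:
                    fn(driver, *match.groups())
```

The real value of a layer like this is that the feature files stay readable to the manual QA folks who wrote the original test plans, while execution rides on the same Appium infrastructure as everything else.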
So the recipes pull the artifact from there. What we say is: the build pipeline keeps working, and at the end of it, it just publishes an artifact into Artifactory. The Chef recipes are then the ones who come along, pick it up from Artifactory, and deploy it into whichever environment we like. So Artifactory is just a handshake point for us. Any other questions? I wouldn't want to say Jenkins couldn't be used here; to be very frank, if you have Chef recipes, nothing stops you from executing them from Jenkins. In fact, the subsystems we have built on top of these technologies, like how people provision their environments, how they run a bunch of their test cases, and how the pipelines work, are all based on Jenkins. Like we said, Chef is basically one of our legacy systems, something we've been using since before a lot of this work was done. So as and when we move from a continuous integration world to continuous delivery and deployment, we are planning to cover this in one of our roadmaps. Also, like I said, we are playing with a couple of relatively newer and cooler technologies, where we still have to see which way we want to go. We have integrated it for a few teams. The way we look at it is: while we have our broad-level rules at the organization level, teams have the liberty of moving onto the pipelines at their own pace, because that is how it works in a microservices-based setup. Any other queries on this? Please.

"When you talk about deployment servers, are you talking about production deployment or non-production?" So, like I mentioned, most of our non-production deployment happens via Docker, because it allows us to have a very miniature version without putting in massive effort. If you look at it, all the hosts on which the Docker containers run are nothing but AWS AMIs; we are, again, heavily AWS-focused, like most startups. Even with Chef, when we do production deployments with auto-scaling, scale-ups and scale-downs, those are largely AMIs being spun up. But what we make sure of is that whenever we are running containers, the AMI is just being used as a base kernel image, and that's all.

So, configuration management and infrastructure as code, from our perspective, are not exactly the same things. For us, configuration management, to begin with, from where we were (again, in our context), was about moving the hard-coded endpoints and other application configuration out. So when you're spinning up a Docker instance, or when you're spinning up an environment using Chef, all you need to do is supply it a configuration file, and that needs to be there. You can't assume you're going to hit a single static IP all the time, whether you're on Docker or on Chef, on a production or a non-production environment, because that really complicates things. Infrastructure as code is, again, something we are looking into; I'll talk about it a little towards the end of the slides, because it's part of the v2 that we are trying to do.
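Going back to that "supply it a configuration file" point for a second, here's a minimal sketch of how a service can pick up its wiring entirely from the environment it's launched in. The environment variable name, default path, and config keys are made up for the example:

```python
# Sketch: the service reads its endpoints from whatever config file the
# provisioning layer (Docker, Chef, ...) hands it, never from code.
# The variable name, path, and keys are illustrative assumptions.
import json
import os

def load_config():
    path = os.environ.get("APP_CONFIG", "/etc/app/config.json")
    with open(path) as fh:
        return json.load(fh)

cfg = load_config()
# All wiring comes from config, so the same build runs unchanged in any
# environment; no static IP ever appears in the code base.
MESSAGING_ENDPOINT = cfg["messaging_endpoint"]
```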
That v2 is moving more towards: how do we get rid of everything and have things done very quickly, with reproducible environments altogether, all the way from your AWS VPCs to your subnets to your AMIs and then, eventually, the code, and not just the code. So that's how we usually like to define infrastructure as code: it covers the entire stack.

Any other questions or queries around this? I'm more than happy to answer them. So, we basically started with Docker as reproducible test environments for non-production. And this is another takeaway we had: doing it on non-production environments definitely has a lot of benefits without you paying any of the major penalties. But when you're running Docker containers in production at scale, you really have to look at each of your individual services and ask whether there are going to be latency issues, any other performance issues, and whether the application itself is written in a way that lets you cluster it. For example, I'm not sure if you're familiar with Kubernetes. Perfect. Kubernetes, in its initial releases, said you should do it for stateless applications; a stateful application on a Kubernetes architecture is going to bomb massively. We did have a couple of stateful services in our legacy architecture which we said we can't really take there. That's a classic case where you wouldn't want to run Docker in production; it's never going to end well for you. We wanted to be the cool kids, but we didn't want to just drink the Kool-Aid.

Any other questions? On compilation: we do have Jenkins slaves which do our builds, and they in turn are Docker slaves. Here's the thing: as it goes with deployable services and containers, they have to be immutable. So when we form a container, we have the binary inside the container, but not the compile-time tools. Those make your image heavy, and you don't want that; images have to be really lightweight. So Docker serves as both the build agents and the test environments for our services. At any given point in time, the way it usually works is, like I said, we wanted to move towards a very strong test-driven approach. Giving that capability to our developers allows them to spin up an environment for themselves, run their tests against it, and then kick off a build, raise a pull request, or move it on to our test teams, saying: hey, this is at least tested well enough; there are enough unit tests, enough functional or component tests, so it will not just land in your hands and blow up. It's more like environment as a service (there's a sketch of that one-click spin-up just below).

I saw your hand; the question is how we do versioning. Thanks for asking that; like I mentioned, it probably would have slipped otherwise. We usually don't do tech bias, but when it comes to first principles, we are very biased, and versioning is something we take seriously. In most places, we do semantic versioning, more or less. On the app side it's actually an easy world for us, because the Play Stores and App Stores of the world force you to stick to it in order to submit your apps. On the server side, in a lot of ways our branching helps us automate our versioning too. Now, there's a slight contradiction there if you look at how semantic versioning actually works.
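Here's the promised sketch of "environment as a service": spinning up a disposable, production-wired set of containers with one call, using the Docker SDK for Python. The registry, image names, and service list are illustrative assumptions, and a real version would also wire in the data layers and a teardown path:

```python
# Sketch: one-click disposable environment, a miniaturized clone of the
# production wiring, built from containers on a private network.
# Registry, image names, and the service list are illustrative.
import uuid

import docker  # Docker SDK for Python

SERVICES = ["messaging", "presence", "media"]  # illustrative subset

def spin_up_environment():
    client = docker.from_env()
    env_id = uuid.uuid4().hex[:8]
    network = client.networks.create(f"testenv-{env_id}")
    for svc in SERVICES:
        # Same network wiring as production, so service config does not
        # change between environments.
        client.containers.run(
            f"registry.example.internal/{svc}:latest",
            name=f"{svc}-{env_id}",
            network=network.name,
            detach=True,
        )
    return env_id  # developers target this id, then tear it down when done
```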
On that contradiction: the semantic versioning spec says that the developer making the change is the one who updates the version. We tried that initially, and a few developers weren't very comfortable doing it; they said, you know, we probably don't want to get into that right now. But here is the simple baseline. What we ended up doing was using our Git branching model to implement versioning. We have a mainline which always carries a particular version, and a tag that's always indicative of what's running in production. Every time you do a hotfix, the build server has a job that checks out the tag, updates the version on the hotfix part, does the build, tags it again, pushes it back into the mainline, and moves on. The same goes for feature branches: every time a branch has to be promoted, the build server does the bump. If you know the Maven release plugin, that's exactly what we do for all our languages; we have a small version-bump tool for Python and the other scripting stacks (a sketch follows below).

I saw one more hand go up. "Is SonarQube only for static code analysis?" Static code analysis for us is just one of the uses of SonarQube; we also end up storing most of our test results there, along with the data. One of the really good uses we have for SonarQube is its APIs; I'm not sure how many of you have ever explored them. It has a very rich API set, and we use a lot of it for our reporting. When we do release postmortems or release retrospections, we pull in a lot of data from SonarQube, because it gives you trend data, not just dashboard data. So we pull in data from the Git APIs, from the Sonar APIs, and from JIRA; we combine them and use that to derive more and more intelligence and information around our releases. So there's a lot of API-level stuff we associate with SonarQube, not just static code analysis. Any other questions around this? Great.

So this is how it really works with us. Any developer in our ecosystem codes a feature. We actually integrate SonarQube into all the IDEs; we have that capability, so developers don't need to submit their code to see what the SonarQube analysis would look like. Every time they hit Command-S or Control-S, the SonarQube binding runs in their IDE and tells them the potential code violations on that piece of code. They also run unit and component test cases, on which we have slowly started enforcing a tighter and tighter SLA over time. They run functional tests for the feature; the functional tests here are largely applicable on the app side, because we use Robolectric and a bunch of other tools for that. Once the local developer deems that the feature is good enough for submission to the mainline, a PR is raised. The moment a PR is raised, the build and release infrastructure kicks in. The PR is captured via a standard webhook; we take the change set, merge it onto whatever is in the mainline, run static analysis on the change set, and run all the unit tests. Then we do a code review.
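Here's the promised sketch of the version-bump step for Python and the other scripting stacks, a rough analogue of what the Maven release plugin does for our Java services. The VERSION-file convention is an assumption for the example:

```python
# Sketch: bump a semantic version as part of a build-server job.
# Reading/writing a VERSION file is an illustrative convention.

def bump(version, part="patch"):
    """bump('1.4.2', 'minor') -> '1.5.0'; a hotfix bumps 'patch'."""
    major, minor, patch = (int(x) for x in version.split("."))
    if part == "major":
        return f"{major + 1}.0.0"
    if part == "minor":
        return f"{major}.{minor + 1}.0"
    return f"{major}.{minor}.{patch + 1}"

if __name__ == "__main__":
    with open("VERSION") as fh:
        current = fh.read().strip()
    new = bump(current, "patch")
    with open("VERSION", "w") as fh:
        fh.write(new + "\n")
    print(f"{current} -> {new}")  # the job then commits, tags, and pushes
```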
When we actually post it for code review on GitHub, we post all the static analysis comments and the unit test results on the PR as well, so that the person doing the code review has enough statistical data and only has to go through the logical part of it. Once the code review is done and the PR is merged, we build the mainline for real. The earlier part you're looking at is a faux merge; it's part of the standard pull request builder, and it does not actually merge and commit the code into your repository. But the other one here is a post-merge job. As soon as you merge, its job is to build out a release candidate altogether. The standard philosophy says that every time you merge into the mainline, you're supposed to generate a release candidate, and that's essentially what we do. Once we build the mainline, we run component and sanity tests on instrumented code, because we also want a lot of coverage data out of it. We run functional test cases on instrumented code as well, after which we move into a nightly build. The nightly build is where we run full regression, and when we say full regression, this is most of our app-level test cases, because these are the Appium/Cucumber suites that take a lot of time, and we don't want to run them every time a mainline build is done. Every time we do a nightly build, we run full static code analysis, not just on the delta of the incoming code change. We also do a lot of benchmarking, a lot of perf tests, and dynamic analysis, which is basically code coverage and other data. Once all of this is done, everything goes into Artifactory.

So what we're saying is that we've essentially broken the entire pipeline into four major steps. The very first step lies with the local developer: because the local developer has the tools to spin up an environment, and an application he can point at that local environment instead of some other server, the developer can essentially debug everything on his or her own box. Once that is merged, the build and release pipeline takes over. Now, if you look at this, there is absolutely no mention of any particular language or build tool, because this pipeline holds true for everything we do inside our ecosystem. And that is what we meant by our choice of tools having to be simple, easy trade-offs. If tomorrow we decide to replace Jenkins with, I don't know, Bamboo or some other CI tool, or replace Artifactory with Nexus or any other alternative, we don't want the entire pipeline to be impacted; we just want that particular piece to be swapped, and we're done.

So, here's an interesting thing about how our nightly build runs. If you look at it from a client perspective, the client essentially needs a server environment that always works. So we have a boundary where the server CI comes in, and it has a blue-green kind of state. There is the last known good server build, which is running and exposed to the client for our nightly builds, and there is a separate server on which we take the latest release candidates and keep running the test cases. If the latest server code, the RC, passes through and we say everything is good, we actually swap the two servers.
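As a rough illustration of that swap, here's a minimal sketch of the blue-green pointer logic; the state file and the way validation results arrive are assumptions for the example:

```python
# Sketch: blue-green handover for nightly runs. The client suite always
# targets the "stable" side; a release candidate that passes validation
# swaps the pointer. The JSON state file is an illustrative convention.
import json

STATE_FILE = "server_pointer.json"  # e.g. {"stable": "blue", "candidate": "green"}

def promote_if_green(rc_tests_passed):
    with open(STATE_FILE) as fh:
        state = json.load(fh)
    if rc_tests_passed:
        # The validated candidate becomes the stable environment the
        # nightly client tests run against; the old stable becomes the
        # next deployment target.
        state["stable"], state["candidate"] = state["candidate"], state["stable"]
        with open(STATE_FILE, "w") as fh:
            json.dump(state, fh)
    return state["stable"]
```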
As for the client builds: we obviously make sure the swap does not happen while the client test cases are running, but at any point in the day when they're not running, the engineering deployments keep happening. So in the nightly, whenever the client test cases run, it is guaranteed that they are always running against the most stable server environment.

Essentially, this is what summarizes our entire experience with the exercise. First of all, DevOps is a massive cultural shift. The amount of effort we had to put into building all of these systems was nothing compared to the effort we had to put into convincing all our stakeholders. Like with every traditional startup, we had the usual questions: Will this slow us down? Isn't this a lot of process? But everything works, why even move to this model? And the answers were fairly clear. Point number one, we don't like to call this "process". The idea is that when you say people have to go through a repeatable way of producing software, certain first principles come into play, which is why I said at the start that no tool is going to solve your problem if you don't fix your first principles. The second thing is fairly simple: we realized that when you're playing with the big boys, you've got to play like the big boys, right? You can't be an under-10 cricket player and expect to play international cricket.

Next, legacy is not a blocker for DevOps. You would be surprised that a lot of our services which follow this model today are probably not traditional microservices; they don't really have that clean separation of concerns or dependencies. Which is why, as somebody asked, do you guys also run containers in production? The answer is: we are slowly revamping, we want to rearchitect those services so that we can run them as containers. But does that mean we're not going to do the things we can already do with them? Absolutely not. You've got to start somewhere, make small, iterative progress, and then eventually hit the overall continuous deployment goal.
The next piece is something I've emphasized a lot: decoupling between the different pieces of the pipeline. All the tools we have essentially communicate through a standard boundary. It allows for better tuning and customization: like I said, if you're not happy with Jenkins some given day, you can always throw it out and bring in something else you like. It also allows you to prototype: you can always say a bunch of your services will move to another tool altogether, and if you see that the comparative benefits are worth it, you slowly migrate the entire thing. And it's also very mission-centric rather than tool-centric. We never wanted to build our dependency on one given tool, to be so dependent that we'd say: we're done if this tool goes away, if this tool stops supporting us. And that did happen to us, by the way. We were using Appium and a Cucumber framework, and Apple did some funky stuff that introduced a lot of leakage into our iOS test suites, which we had to contend with for a long time; those of you who've been doing iOS automation using Appium will have faced the same thing of late. Luckily it's on a path to being fixed, but we were jolted. The good part was that since we hadn't put all our eggs in one basket, we said, okay, we can start exploring other options altogether.

The last bit is metrics. Like I said, you have metrics all over the place, and if you don't treat them well, they'll just become noise; you'll have trouble making sense of them, and eventually you'll be very tempted to throw all of it out of the window. The idea here is to pick your metrics and aggregate them depending on the level at which you're viewing them. Like we said, when we sit for release postmortems, for release retrospections, we do not look at each individual line or each individual day; we aggregate data using a lot of APIs, do a lot of correlation, and then arrive at what is working and what is not. It helped us a lot initially to realize that we weren't able to track all the pull requests that were coming in. Why? Developers weren't attaching a JIRA ID to them. How did we fix that?
The moment we saw the trend, we said: in the PR builder, if you don't attach a JIRA ID, you will not be able to merge. The result was a bit of friction at first, people saying, hey, this is process; but over a period of time even they realized it helps a lot, especially when it comes to debugging. That's pretty much what the session was all about. I'll be more than happy to answer questions if you folks have any.

On continuous deployment on devices: we do continuous deployment on our test devices. We maintain our own device lab using STF, and other than that we use ADB heavily for Android. For Apple, we are now slowly starting to look into it. Sorry? Oh, yes, device failures are one of the things we run into. We had a homegrown solution earlier that helped us talk only to devices that were up and running, and of late we have started replacing it with STF. Like I said, when you have a lab, you can put in measures to make sure your devices are always up, and tools like STF are the perfect routing medium to make sure your test cases run only on the devices that are up. Any other questions?

It is purely a function of, like I said, where things go: artifacts usually go to the release repository after the nightlies, but the snapshots are usually available right after the CI cycle. So once you merge, we basically build a release candidate. Depending on the project, that's a function of maybe 4 or 5 minutes after you merge into the mainline, to probably a few more minutes. But generating RCs is the easy part; RCs don't take that much time, especially on the server side of things, and test cases there are usually quick. For us, the longest pole is around 30 minutes, which is the validation of a big chunk of a server-side piece that hasn't been fully moved to a microservices architecture; like I said, we work around it, and we run around 2,000 API test cases on it. If it all looks good, the PR is merged, and the moment you merge, you have the release candidate there, and all the testing happens on the release candidate.

Next question. There are places where we maintain them both as the actual binary as well as Docker images. What we do with Jenkins is: when we build, the Jenkins job produces, say, a JAR or a WAR, and also the Docker image. The only place where they are correlated is that they are versioned alike: if you are building 1.1 of a given JAR, the Docker image will also be called 1.1, and that's how versioning really helps.

Well, like I said, we have our own limitations, but we are definitely exploring Kubernetes big time, and we really like what it gives us. Like somebody said, infrastructure as code: Kubernetes is pretty awesome, you maintain a YAML and you can reproduce the entire data-center-level ecosystem. It's just that Kubernetes in itself is solely for containers, whereas Terraform is a more open-ended tool: you need not use only containers with Terraform, it also works with your AWS resources, AMIs, and so on. And all the internal artifacts that we generate go onto the internal repository. Any other questions? Of course, interesting question. So this is exactly where human bias comes in; I don't know if it's useful or not. So how do you say that a piece of code has been
reviewed? So, this part here: you take your code, you run static analysis, and you make sure there are no new major violations. SonarQube has multiple severity levels: there are blockers, majors, minors, and so on. You run all the unit test cases, they have to pass, and then you submit it for code review. Now, technically, your static analysis and unit test cases have to be in a state where, if you run them and you find no alarms, you can say this is good enough. But a few teams are not very comfortable with that. All it takes for us is one tick on a Jenkins job that says: merge if you find everything okay. So we have the capability. Some teams request us and say, hey, we might still want a human eye in the code review. There are a couple of teams who are actually comfortable and say, if you see a tick, just merge it, don't come to us. And there are some teams that say, hey, we still want a pair of eyes going through it. So like we said, it's more about comfort.

Yeah, precisely. It's not that we don't have the capability; what we're saying is that this is where we listen to stakeholders as well. If there is a stakeholder who says, hey, for the time being we are more comfortable with manual code review, we go with it, with the disclaimer that this will slow you down, and that you would rather invest that time in writing much better test cases and making sure you are not introducing more violations. But it's all about comfort, so they say, no, we are more comfortable with this for the time being. There were a couple of teams who initially said they were more comfortable doing code reviews; later they realized, oh, you know what, we already have so many unit test cases, we have good coverage, we haven't had any untoward incident, and not many review comments have come in over the past two releases, so let's just go ahead and enable auto-merge by default. That's where a lot of release retrospection helps us: you pull all that data out of GitHub and Sonar, you put it together, the teams sit together and say, what do we do with this data? Does it mean we can go ahead and automatically merge, yes or no? So our job is to build systems that can do these things, but most of it is a switch that we can turn on and off.

I think we are out of time. Perfect. Thanks for being an amazing audience; it's definitely been a pleasure being here. Have a good day.