It's my endeavor today, with this talk on push on green, to give you a brief on what Indeed is. Sorry, am I not audible? Okay, I'll take it that I'm audible. So, Indeed is the world's number one job site. We help people get jobs; that's our mission. We are spread across 14 countries and 29 cities over the globe. Here are some brief stats to give you an estimate of the scale we deal with: 250 million unique monthly visitors, 150 million resumes, 180 million ratings and reviews, 25 million jobs.

How different are we from the rest of the job portals? Firstly, we are not a job portal; we are a job search engine. You search for a job, click on it, apply, interview, get hired. Those are the five steps to get a job, and we put the job seeker first, before anything else in our organization. And what do I do? I work with the apply block over here, on the Indeed Apply team, where we handle the complete flow, right from the point a job seeker clicks on apply till the application has been delivered to the employer.

Now, coming to today's talk: it is about push on green. The agenda is divided into four parts: what is push on green, why did we need it, how did we do it, and what are our key takeaways from it.

What is push on green? Push on green is also known as continuous deployment, or rather the other way around: continuous deployment is also known as push on green. In the CI/CD world, continuous integration, continuous deployment, and continuous delivery are three different important aspects, and today we're going to talk about continuous deployment. What exactly does continuous deployment mean? If you're confident in your code, deploy it without any manual intervention. That is exactly what continuous deployment means.

Now, why do we need that? What is wrong with manual testing, or with the existing processes that most of us follow? To give you some context, let's understand how we grew as an organization.
We grew from somewhere less than 500 people in 2012 to somewhere around 9,500 employees today. We had explosive growth, both in the number of engineers working for Indeed and in the number of users using Indeed. Now, what did that get us into? We had a long list of features, or products, to be delivered. At the same time, as the number of members in a team increases, you'll also see some issues popping up. Issues like: if you merge 8 to 10 feature requests into one QA branch, what would happen? You're obviously going to get into conflicts. How do you resolve that? How do you prioritize it? You're going to run into some inevitable issues with large teams. Now, how do we solve that? We can't have the engineers spend their valuable time on this. These are mostly manual, operational tasks that can simply be automated. The effort it takes to automate this is large in the initial stages, but we soon see the fruitful results of it in the future.

What would happen when we automate this? Obviously it would increase our engineering velocity; by that I mean the amount of productive work an engineer does is increased. Engineering velocity is fine, but what is it contributing to the product? The product quality is enhanced as we grow with this initiative. How does that happen? Manual testing ultimately relies on human judgment, and human testing is always error-prone. Error-prone meaning: if two people are working on the same testing, there could be different perceptions of the test scenario, or there could be different rigor from deployment to deployment. It may not be the same rigor that we show while testing the product, hence the consistency differs.

Now, to give you a small sneak peek at what we achieved doing this: one of our projects' deployment time has come down by 98 percent, from 32 hours to 32 minutes. Though this number is very large, and we understand it's not possible for every project, this is something really good that we started off with and that we achieved, so we'd just like to show it off. But on average, we have observed that teams who have taken up this initiative have decreased their deployment times by 30 to 50 percent. That's a really good number if you consider it in terms of engineers' time.

Now we understand what push on green is and why push on green. But how do we achieve it? How do we get it rolling with the existing code base and the existing systems in place? It's all about the confidence you need to have in your entire deployment chain. When I say deployment chain, I mean right from your coding till the point you put your code in production and monitor it. How do we build that confidence? Because we are all used to manual testing, and we are all used to having human judgment in place. How do you let go of that judgment? How do you completely rely on your automated systems? These are some baby steps that we need to take for the push-on-green initiative, to gain confidence in your continuous deployment.

Test coverage is one of the important things, and to have test coverage we cannot have large deployables. Large deployables are always time-consuming to deploy and error-prone, so we need reasonably small deployables. Then come unit tests. There's a myth, or a wrong assumption, among most of us engineers: that having X percentage of unit test coverage will ensure good stability of the system. Rather, from my experience, I would say no defined percentage of unit tests is going to give you that confidence. It is all about how confident an engineer is about his or her own code and about the systems around it that monitor your environment. Some projects may need 80 percent test coverage; some may need 95.
It is a collective effort here as a team: you collaborate with each other, discuss and understand the complexity of your product, and come up with the number that fits your team best. One size may not fit all.

Integration tests: in a very rapidly changing environment, most of us let go of these. Integration tests are not used much, but if you want to save time, if you ultimately want to make better use of your time, integration tests are very much needed. How do you rely on your downstream systems? How do you test your failure scenarios? It's all based on the integration tests.

Then, adequate logging. Logging is one thing that I've seen vary from engineer to engineer all through my career: either we see a lot of logging, excessive logging, or scarce logging. I would say excessive logging is still fine if you can afford the infrastructure, rather than scarce logging, because logs are ultimately the only evidence and witness to whatever happens within your application.

Now, having said how to gain confidence in the deployables, how do you verify your changes? How do you replace what is existing? How do you change the roles and responsibilities within your team? The developer, as the name indicates, is responsible primarily for the feature development, and then for unit tests. Unit tests are the only way to verify any small piece of functionality in isolation. Then come integration tests. This is a slight change in the existing process for most teams: we want the developer to write the integration tests. Meaning, get creative; write stubs or mocks or whatever you think is necessary to perform an end-to-end integration test within your own ecosystem. Then what does the QA do? If the developer writes all the tests and develops the code, what is the QA's responsibility? Can we get rid of QA?
No. Our answer to this question is: manual testing was the responsibility of QA in the past, but today it is to improve the robustness of the product. Functional testing can be automated by the developer, but when it comes to load testing, performance testing, stability testing, or cross-browser testing, there are multiple other facets to the product, and those will be performed by the QA. The QA is going to improve the robustness of the product. They're going to be the gatekeepers of our product quality. They are no more the manual testers; the title truly indicates it: quality assurance. So we're going to stick to their title as quality assurance.

Next comes: how do we verify these deployments? Now that we've divided the responsibilities, how do we automate the verification? There are certain basic things that you have to build into your application for it to be completely automated. The first thing is health checks. Every system may depend on one or more external services, and health checks are a way to identify, at the startup of your application, whether your downstream systems are working properly or not. We at Indeed have decided to come up with three different levels of health checks: weak, strong, and required. Weak means we indicate that a dependency has failed, but the application can still run without it; we may want to log it and just leave it there for a developer to come back and see. Strong means we would send proactive alerts to the developer: a dependency has failed, but we don't want the application to stop, so we would still run the application. And then there's the third level, required: just fail the deployment altogether.
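The three health-check levels just described can be sketched in a few lines. This is an illustrative sketch, not Indeed's actual implementation; the function and dependency names are made up.

```python
from enum import Enum

class Level(Enum):
    WEAK = "weak"          # failure is logged; the deployment proceeds
    STRONG = "strong"      # failure alerts the developer; the deployment proceeds
    REQUIRED = "required"  # failure fails the deployment altogether

def evaluate_health(dependencies):
    """dependencies: iterable of (name, level, is_healthy) tuples.
    Returns (deploy_ok, alerts, log_lines)."""
    deploy_ok, alerts, logs = True, [], []
    for name, level, healthy in dependencies:
        if healthy:
            continue
        if level is Level.WEAK:
            logs.append(f"weak dependency {name} is down")
        elif level is Level.STRONG:
            alerts.append(f"strong dependency {name} is down")
        else:  # Level.REQUIRED: the application cannot run without it
            deploy_ok = False
    return deploy_ok, alerts, logs

# A weak and a strong dependency are down, but the required one is up,
# so the deployment may still proceed.
ok, alerts, logs = evaluate_health([
    ("geo-lookup", Level.WEAK, False),
    ("resume-store", Level.STRONG, False),
    ("job-db", Level.REQUIRED, True),
])
assert ok
```

The value of the three tiers is that one startup check produces three distinct outcomes: a log line, a proactive alert, or a failed deployment, instead of a single pass/fail bit.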
We can't run the application without this downstream system. These three levels have really helped us in segregating, or I'd rather say separating, our concerns by the level of our dependencies.

Then, exceptions in logs: as mentioned earlier, with adequate logs we are now able to understand where an exception has occurred, or what the anomaly within the deployment is. Having said that, we also need dashboards to know the system-level and application-level metrics, meaning we want to monitor CPU utilization, memory utilization, response times, and so on. How do we do that? We use Datadog dashboards for that. It is a third-party tool that we use to analyze all our metrics, from the system level to the application level.

Then we use canary analysis. Teams whose product changes very rapidly would want to go with canary analysis. What does canary analysis mean? We deploy the latest version of your deployable to only a very small subset of production, receive live traffic on this latest version, and analyze the performance of the deployable. If you think the deployable is good to go, then we promote it to the rest of production. But if you see odd or unhealthy behavior of the application, then we roll back.

As an organization we support this as well as blue-green deployments. For teams which undergo changes very rapidly and which cannot afford a lot of extra infrastructure, we'd suggest going with canary analysis. But teams who have stabilized, teams like us who do not want to risk any customer-facing products, go with blue-green deployment.

Now, these are the ways we perform the verification. So if any of these verifications fail, what do we do? It is automatically rolled back. That is what continuous deployment is all about.
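The canary promote-or-rollback decision can be sketched as below. The metric names and thresholds here are assumptions for illustration; a real canary analysis would compare many more signals between the canary slice and the rest of production.

```python
# Hypothetical budgets for the canary slice; not Indeed's real thresholds.
CANARY_MAX_ERROR_RATE = 0.01  # tolerate at most 1% errors on the canary
CANARY_MAX_P99_MS = 500       # tolerate at most 500 ms p99 latency

def decide_canary(metrics):
    """metrics: dict of observations gathered from the small subset of
    production instances running the new version under live traffic.
    Returns 'promote' (roll out everywhere) or 'rollback' (revert)."""
    if metrics["error_rate"] > CANARY_MAX_ERROR_RATE:
        return "rollback"
    if metrics["p99_ms"] > CANARY_MAX_P99_MS:
        return "rollback"
    return "promote"

assert decide_canary({"error_rate": 0.002, "p99_ms": 180}) == "promote"
assert decide_canary({"error_rate": 0.08, "p99_ms": 180}) == "rollback"
```

The point of encoding the decision this way is that "does the canary look healthy?" stops being a human judgment call and becomes a check the pipeline can run, and roll back on, automatically.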
If you fail one of your verifications, you're going to roll it back. Now, all this is good in theory. How do we do it? We already have a stable, almost mature product. How do we get to the stage where we adopt this completely automated testing? You can't do it in one go, at one stretch. We have to take an incremental approach to it. So, for that reason, here is what we came up with. We rely heavily on Jira for our process, so we said we are going to integrate Jira with our push-on-green initiative. Any team who wants push-on-green to be taken up can set up their deployment pipeline according to the standards we have set in place, have the tests come up, and then, on the ticket, mention a label which indicates the pipeline through which this particular ticket will be deployed automatically. That means whenever a developer has coded and tested a change, if he or she labels the ticket with this label and puts it in a Pending Merge state, then we initiate an automatic deployment. Once a team has enough confidence that the automatic deployments are good, by which I mean they have set up all the required verifications and monitors and whatever we've seen earlier in the verification steps, then we go for full push-on-green.

What are the key takeaways? We have increased the product quality: there's no room for inconsistent testing, as I mentioned earlier. We have increased the reliability: canary analysis and other techniques have helped us increase the reliability of our product. And then the operational efficiency; this is very crucial. We no more have contention to merge to QA.
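The Jira-label trigger described above amounts to a small predicate the pipeline evaluates per ticket. This is a sketch under assumptions: the label name, ticket fields, and status string are illustrative, not Indeed's actual Jira configuration.

```python
# Hypothetical label and status names, for illustration only.
PUSH_ON_GREEN_LABEL = "push-on-green"
READY_STATUS = "Pending Merge"

def should_auto_deploy(ticket):
    """ticket: dict with 'labels' (list of str) and 'status' (str).
    Deploy automatically only when the developer has opted the ticket
    into push-on-green and moved it to the ready state."""
    return (PUSH_ON_GREEN_LABEL in ticket.get("labels", [])
            and ticket.get("status") == READY_STATUS)

assert should_auto_deploy({"labels": ["push-on-green"], "status": "Pending Merge"})
assert not should_auto_deploy({"labels": [], "status": "Pending Merge"})
```

Because the trigger is per-ticket rather than per-release, each deployment carries one change, which is what makes the later point about rollbacks (you know exactly which feature to revert) possible.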
We no more have to resolve conflicts. Then the last, or the ultimate, thing that we've achieved is that engineering velocity has increased drastically. Thanks for the talk. Anyone, any questions?

In the process of push-on-green, you mentioned that you completely removed the manual testing process. Is it completely removed? Because only a human being can come up with new testing scenarios; automation can only repeat scenarios. So how did you achieve it?

Yes, so as I mentioned earlier, there are two parts to it. For the teams which are in the process of increasing their push-on-green coverage, we still have manual intervention, where a QA engineer or a developer pitches in and writes scripts to automate every scenario that they would want to execute.

Okay, so they produce new scenarios also?

Yes. These scripts don't get generated automatically, right? We have QA or the developer writing these scripts. So any test scenario that you think has to be executed has to be in the form of a script. There's no manual testing over there.

Sure, thank you.

Am I audible? Yes? Okay. You shared best practices about continuous deployment from your project experience. I want to ask: do you see any challenges in terms of canary deployments? And also your release cycle; I suppose you have a two- or three-week release cycle for your application, right?

So the whole point of moving to push-on-green was to decrease this release cycle. We no more have fixed release cycles; in fact, we have teams who do multiple deployments in a day. At least my team deploys twice a day. That's where we've moved to with automatic continuous deployment. The whole point of it is that you don't have multiple features going into the same deployment, so that you don't run into issues if there's a deployment failure: you know exactly which feature it was and what to roll back. So we've now done away with the release cycle concept.
So, can you remind me of the second question you had in mind? It was about the release cycle. So, yeah, now we don't have the release cycles anymore. And how did we get to push on green, the obstacles? Yes. For the teams who are already in the critical path, meaning who are customer-facing, like us, any change we make is going to be very much visible to the job seeker. So for teams like this, we choose to go with blue-green deployment rather than canary analysis, because we don't want even that one percentage of users to be affected. But the teams who can afford that risk, or who can take the risk of running the test for a duration, do go with canary analysis. That is the reason we support both of these deployment techniques. Does that answer your question? Thank you.

Hi. Hi, I'm Olya. This is Anup here. I have a question in the case of a blue-green deployment. You push a deployment to production, and it involves a database change, for example a table entry that you make. Now there's some problem and you want to revert it, but in actual production there is new data which came into your system during that period of time. Now you want to revert those changes. How would you do that?

The whole point of blue-green deployment is to avoid running into this situation. So let me first define blue-green deployment.
In a blue-green deployment, we would deploy a new set of instances and observe the behavior of the deployable on, let's say, n instances; this would be the same number as are serving the current live traffic. Once we think the deployable is stable, that's when we route the traffic to this new set of instances and destroy the old instances. So in this case, whenever you make a DB entry, there's no new traffic coming onto these new instances; it's only our tests that run on these new instances to observe the behavior. Had it been canary analysis, it would have been a different matter. Does that answer your question? Thank you.
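The blue-green switch described in that answer can be sketched as follows. This is a toy model under assumptions: the `Router` stands in for a real load balancer, and the fleet and function names are illustrative.

```python
class Router:
    """Toy traffic router; in production this would be a load balancer."""
    def __init__(self, live_fleet):
        self.live_fleet = live_fleet

    def switch_to(self, fleet):
        self.live_fleet = fleet

def blue_green_deploy(router, blue_fleet, green_fleet, verify):
    """Stand up the green fleet alongside blue, verify it while it
    carries no live traffic, then flip the router and destroy blue.
    verify is a callable run against the idle green fleet.
    Returns the fleet left serving traffic."""
    if not verify(green_fleet):
        # Green never received live traffic (and no real-user DB writes),
        # so reverting is simply discarding the green fleet.
        return router.live_fleet
    router.switch_to(green_fleet)  # route live traffic to green
    blue_fleet.clear()             # destroy the old instances
    return router.live_fleet

router = Router(live_fleet=["blue-1", "blue-2"])
blue, green = router.live_fleet, ["green-1", "green-2"]
serving = blue_green_deploy(router, blue, green, verify=lambda f: True)
assert serving == ["green-1", "green-2"]
```

This illustrates the answer to the question above: because verification happens before the traffic switch, a failed green fleet holds no user data to unwind.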