Before we start, let's do a quick show of hands. How many of us believe that continuous integration, CI, is an integral part of the development process? Please raise your hands. Okay, cool. Keep your hands up if your CI takes less than 30 minutes to produce stable build artifacts after running all the available tests. That's nice. What about 15 minutes? What about 10? Wow, cool. So we have at least two people here who can help us out; if you have any questions, you can definitely reach out to them. In our case, though, around 10 to 12 months ago, our CI system used to take around 60 to 90 minutes to produce stable build artifacts after pulling the code, building it, and running all the available tests. That's pretty slow. And it used to get really bad when a test failed, because while the prior build was running, multiple developers would commit their code, and isolating which commit actually caused the problem and then fixing it meant four to five hours before we got the next set of stable build artifacts. You can imagine the situation when there was a mistake in isolating or fixing the problem; it used to take much longer. Clearly, the team was not happy about it, because our ability to get quick feedback, learn from it quickly, and course-correct was getting impacted. Having these long-running build cycles also had a specific impact on developer behavior. Developers were discouraged from committing their code frequently, because if they committed frequently and the build failed, they would then have to spend time figuring out whether they caused the problem or someone else did, which is a waste of time. They were also afraid of committing their code in the evenings. You can imagine, right? If I commit my code in the evening and the build fails, my entire evening is lost.
So they would prefer to keep their code on their machines, come in the next morning, and commit it then. Now, that leads to other problems: the longer you stay away from your repository, the longer it takes to integrate your code with it. So that's a waste of time too. And God forbid the hard disk crashes, because then your entire effort is lost. These are profound business impacts. But apart from the business impacts, there are also social impacts. Like me, has anyone been in a situation where you had to stay back late at night to integrate your code with your colleagues', just because your CI system doesn't exist in the first place, or takes too long to run? Anyone? Okay, cool. Have you been in a situation where you had to rush back home, leaving behind an unstable build, so that you don't miss dinner with your family or friends? Okay, this is how your colleagues would be looking and pointing at you when that happens. Or have you decided to stay back and make sure the builds are successful, all the tests are passing, and the build artifacts are produced, so that your colleagues are happy and not pointing fingers at you — but now, worse, you have to hear your kid cry and say, "Dad, you're late again today"? Are these situations familiar to you? Are they acceptable to you? Understandable. Understandable, okay. They were not acceptable to me. I didn't want to be in that situation, and that is what led me on my quest to reduce our CI build time. Now, this is how our new CI pipeline looks. The first job is triggered whenever code is committed; the only thing it does is pull the code. Once the code is pulled, it triggers the next job, our build job. That has the responsibility of building the code and producing the build artifacts — the WAR file and the database artifacts. Once the build is done, it triggers two jobs in parallel.
The very first one is a deploy job, which picks up the artifacts, deploys them to the integration environment, and on top of that environment runs two types of tests in parallel: one set of BDD tests and one set of REST tests. The other job you see here is again a testing job. What it does is run a lot of tests in parallel which are not dependent on the environment as such. Overall, we have now come down to under 10 minutes. And what we are going to see next is how we actually did this — what principles we followed to get here. So let me call Naresh up to talk about the key principles. I think the slide is pretty self-explanatory, so there's not much I need to do, but let's see. There are many principles, but broadly you can classify them into three core principles. The first is that you need to identify the bottlenecks: what is the biggest bottleneck? Focus on it — you always want to optimize the bottleneck. So the first key technique is to focus on the bottlenecks in your build and try to isolate where the bottleneck is. Once you've got that, you figure out how to divide and conquer the problem: if this is your biggest bottleneck, what can I put in place to break it into smaller pieces? Once I've done that, I want to focus on failing fast. I want to set up my build pipeline such that I get feedback as early as possible whenever anything goes wrong. So those are the three broad categories, and next we are going to go into each of them and talk about the specific techniques we actually used to bring our build time from 90 minutes down to 10. But before that, a quick commercial break. I'm Ashish Parkey, and I'm very passionate about solving business problems.
I work for IDeaS, a SAS company. Both IDeaS and SAS are into analytics, and at IDeaS we help hotels and car parks increase their revenue. Yeah, I'm Naresh Jain. I work with IDeaS on alternate weeks, and we've been helping with a whole bunch of things — the build is one part; we also work with the developers on clean code and various other practices. So we're really focused on bringing agility into the organization without using the word Scrum. Cool. So let's have a look at the first principle: focusing on bottlenecks. Now, in your experience, what are the elements or entities in a system that lead to a lot of bottlenecks, in whatever systems you have worked with? Any examples you can cite? Sorry? Yeah, but what leads to it? If you were to focus on the bottlenecks, keep drilling down: what are the things that actually lead to bottlenecks? Tests? Okay, tests. Why tests? Reducing their time is actually our goal here. Why are the tests slow? Keep asking why, keep going down. Why are tests slow? IO. You're saying database — when a test hits the database, it becomes slow. But when it hits the database, why is it slow? It's basically because of the IO, right? So if you drill down, IO is one bottleneck. That's one aspect. Any others? So, configuring the environment — setting it up and bringing it up — that takes some amount of time, okay? And one of the points I noticed is the knowledge of the environment setup. People have put so many things on the server: it's not only catering to the build, but to so many other things which don't need to be part of the build server and could have been elsewhere. So your CI server itself is not getting the CPU and memory it needs, because other processes are hogging it, okay?
That's an easy one to resolve. I think you were saying something? That could lead to increased time, but where you need to go further down is this: even if I ran my tests sequentially, if they were fast enough, I wouldn't care — they would get done really quickly. The problem is that each test is doing one of the things other people have highlighted, like IO, which is what makes them slow; running them sequentially then makes it even slower. So, in my experience, disk IO is one of the primary causes of any slowness you see in a system. If your tests depend on a lot of file operations — reading files, writing files, searching files — the tests are going to be slow. As you also rightly pointed out, if the tests are heavily dependent on the database and are continuously reading from and writing to it, they are going to be slow. These were very obvious problems for us to look at. So we started going through our build scripts and our Jenkins configurations to see if we had something obvious there, and that's where we identified this problem. This is how our earlier build pipeline looked: the first job would pull the code, build it, and produce all the artifacts, and the next set of child jobs was responsible for running the tests. All these child jobs required the parent's workspace so that they could use the artifacts it produced. Now, when you create a new job in Jenkins, by default it creates a new workspace for that job. So, naively, in every child job we deleted the workspace and then copied the parent's workspace into it. That's how we replicated the jobs, and we had these five jobs running in parallel. And our parent workspace turned out to be 3 GB.
Now, what happened was that every time the builds ran, we were unnecessarily deleting all the files in the child workspace and then copying 3 GB of data over from the parent workspace. That was taking unnecessary time, and it was very low-hanging fruit for us to deal with. That is where we came across a utility by Microsoft called mklink. What it does is help you create a link — a junction — between two folders. So now I can create a link between my child workspace and my parent workspace, and that way I don't have to delete or copy anything. That saves a lot of time. Eventually, we also found that Jenkins itself provides an advanced option where you can customize the workspace and tell a job to use the parent's workspace as its own. That saves a lot of time, too. Another thing we discovered while going through our build scripts was that we had multiple WSDLs in our system, and those WSDLs were static. Yet with every build, we were generating JAR files from those WSDLs. Doing this as part of the build process was not really necessary, because with every build we were taking the hit of creating those JAR files. So we switched away from it: instead of generating the JAR files every time, we checked the generated JARs into source control. We are using Ant, not Maven, so that was the option we went with, and it saved us some time. As I mentioned earlier, IDeaS is an analytics company, and what we do is forecasting for hotels. In one particular case, one test was running pretty long, and after scanning the log files we discovered that the test was running a forecast for the next 10 years.
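Going back to the workspace trick above: on Windows, the link between the child and parent workspaces can be created with mklink. This is an illustrative sketch — the paths are made up, not the team's actual ones:

```bat
:: Create a directory junction so the child job's workspace *is* the
:: parent's workspace -- no 3 GB delete-and-copy on every build.
:: /J = junction (works for directories, no admin rights needed).
mklink /J D:\jenkins\workspace\child-tests D:\jenkins\workspace\parent-build
```

As mentioned in the talk, Jenkins freestyle jobs also offer Advanced → "Use custom workspace", which achieves the same result without any linking at the filesystem level.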
Now, in the context of that test, running a forecast for 10 years was not required — and you can imagine, when we talk about running a 10-year forecast, it involves a lot of CPU and a lot of IO, and that was wasted effort. It was an integration test. All we had to make sure of was that from Java we were making a call to one of the services, and that the service was actually doing the forecasting and returning the results to the database. In the context of that test, we were not checking whether the forecast was accurate for the next 10 years — that was not the scope of the test; it was just an integration test. So, within that context, we decided to bring down the number of days for which we run the forecast. We brought it down to 15 days, and that saved us a lot of time on that specific test. No matter how much we tried, we still landed in situations where we had to use IO. Our goal was to eliminate file IO operations as much as possible, but there was one test which relied on a pre-populated database: delete the old database, copy in the new database, then run the test. That is where our DevOps guy came and asked, have you used Robocopy before? And I was like, no, I don't know what Robocopy is. He told me Robocopy has a feature: given a source folder and a destination folder, it can copy only the differences. Say a new file has been added to the source — it will copy only that file. Say out of 100 files only one has changed — it will copy only that one file. So instead of blindly dropping everything and copying everything, Robocopy again helped us save a lot of time. And rsync is the Unix/Linux equivalent of Robocopy. Now, if you're not able to reduce your disk IO time that way, the other alternative is going for SSDs.
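The differential-copy approach described above looks like this in practice (paths are illustrative, not the team's real ones):

```bat
:: Mirror the golden database folder into the test copy.
:: /MIR copies only new/changed files and removes stale ones,
:: instead of deleting everything and re-copying the whole tree.
robocopy D:\golden-db D:\test-db /MIR
```

The rough rsync equivalent on Linux would be `rsync -a --delete golden-db/ test-db/` — same idea: only the differences travel.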
Now, as you're aware, SSDs do not have mechanical parts the way a hard disk does. This is a sample comparison showing the number of IO operations possible on a standard hard drive versus an SSD, and you can see the numbers are much, much higher on an SSD. This is how I imagine it: a hard drive is like a single-lane highway, where you start seeing the bottleneck, whereas an SSD is like a five-lane highway where my entire traffic can zoom through. Now, SSDs are more expensive than hard drives. And in our context, our Jenkins environment was hosted on a VM, and that VM was part of a shared server. What that meant was I would have had to buy an expensive SSD that could fit into the available server, which also meant going back to make budget changes and take a lot of approvals. I didn't want to get into that. So I started wondering what else I could do right away, without SSDs. That's where we thought of checking whether we could make use of an in-memory database. We are using MySQL in our company, so the obvious choice was MySQL's MEMORY engine. As the name suggests, with this engine the data does not reside on the file system — it resides in memory. Memory IO is much faster than disk IO, so we were hoping that using an in-memory database would bring our times down significantly. But we immediately discovered the limitations of MySQL's MEMORY engine: it doesn't support a couple of the data types we were using with the regular MySQL engines, so we couldn't really use it. Also, this would solve the disk IO problem, but not the process hop — it's still out of process. Yes — so you still have to take a socket connection. Yeah, technically.
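The data-type limitation mentioned above is easy to see. MySQL's MEMORY engine stores table data in RAM, but it cannot hold BLOB/TEXT columns — so a schema like this (table and column names are made up for illustration) simply fails to create:

```sql
-- MEMORY keeps data in RAM, but rejects BLOB/TEXT columns,
-- which the team's analytics schema relied on. This statement errors out:
CREATE TABLE forecast_cache (
    id      INT PRIMARY KEY,
    payload BLOB              -- not supported by ENGINE=MEMORY
) ENGINE=MEMORY;
```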
So it's not going to give you that huge an advantage, basically. Correct. So types like BLOB and TEXT, long text — those are not available there, and our analytics engines make use of those, particularly BLOBs. Then we came across HyperSQL. As Naresh mentioned, HyperSQL can be used purely in memory, and also in process — it has both options. But here's the thing: we have had this product in the market since 2002. We are in the software-as-a-service world, so we govern what kind of databases we run, and we have been using MySQL since then. Over that period, a lot of our queries have become very native to MySQL, to make use of the flexibility and the performance MySQL provides. So it was not possible for us to change our code just so we could put the database in memory, and HyperSQL does not support all the native MySQL queries. So our search continued, and we came across the H2 database. Now, H2 has a MySQL dialect, and it supports many native MySQL queries. But it still didn't support a couple of queries we were using, like INSERT IGNORE INTO or INSERT ... ON DUPLICATE KEY UPDATE. Those are not supported by H2. So our search continued, and then we came across MemSQL. Now, MemSQL is supposed to be wire-compatible with MySQL. What that means is I can simply point my application at MemSQL, run it, and everything is going to work fine — all my native queries are going to work, too. That was a good thing. So we went ahead, thinking we would download MemSQL and start installing it. But then we discovered that MemSQL can only be installed on Linux machines. And our entire Jenkins infrastructure is on Windows.
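For reference, the H2 compatibility mode mentioned above is switched on through the JDBC URL. A sketch of what that configuration looks like (the property name is illustrative; the limitation noted is what the speakers reported at the time):

```properties
# H2 in MySQL-dialect mode: in-memory, in-process, kept alive for the
# whole JVM (DB_CLOSE_DELAY=-1). Even so, MySQL-native statements like
# INSERT IGNORE / INSERT ... ON DUPLICATE KEY UPDATE were not accepted.
jdbc.url=jdbc:h2:mem:testdb;MODE=MySQL;DB_CLOSE_DELAY=-1
```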
Now, just for this one advantage, we didn't want to introduce another machine into our infrastructure that was not native to what we were doing there. We went through so many alternatives, and with every alternative we thought we were almost there, but we were still a bit far away. And I started wondering: what next? All I wanted was to use my own MySQL database, without changing my code at all, and to run it fast, in memory. So I started wondering if there was some way to put the same database in memory and start using it. And to my surprise, I came across something called RAM drives, and a piece of software called SoftPerfect RAM Disk. What it does is this: say you have 32 GB of RAM; the software can carve out, say, 4 GB of it and create a drive out of it. For all practical purposes, for every application running on the machine, that drive looks like any other normal drive. The only difference is that it's not on the file system — it's in memory. So it should run very fast. These are the numbers I have; if you remember the SSD numbers, these are way higher. This is how I imagine a RAM drive: a massive ten-lane highway that traffic can zoom through — my bottlenecks are gone. And I was pretty excited, because now I didn't have to worry about the native SQL queries we had written. All we had to do was put MySQL's data on that drive, change the my.ini settings, run the tests, and things were going to work fine. That's exactly what I did — and I was disappointed. I was disappointed because it did not work. The timings did not go down. And I started wondering why.
In theory, I was right: disk IO is slow, we had many test cases hitting the database, so they had to be slow. And still there was no impact. The only hypothesis I could form was that maybe the disk IO traffic I was imagining just wasn't enough — so no matter how many lanes my highway had, it was not going to help me. So the next question was: if disk IO was not the problem, what else could it be? That is where we decided to do some CPU profiling. We decided to profile our test cases to identify where the bottlenecks in our system really were. The image I have here is from a plug-in called JVM Monitor, an Eclipse plug-in. It shows you two things: the time a particular method took, and the number of times it got executed. That gives you a decent idea of what's happening with your test cases. And we came across a lot of insights we had never expected — we never thought some of these problems were actually there in our code. This was our first insight. We discovered that we were using resource bundles, and the resource bundles were packaged inside our JAR files. Whenever a resource bundle was required, we were scanning all the available JAR files to find it. That's an expensive activity because, again, disk IO comes into the picture. So I had thought database IO was the problem, but the problem was somewhere else: disk IO, in the form of searching through files. The fix was to read each resource bundle the first time it was needed and then cache it. That solved the problem. How many of us are using Spring here? A couple of you, okay.
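The resource-bundle fix above — load once, then serve from a cache — can be sketched in a few lines. This is a minimal illustration, not the team's actual code: `expensiveScan` stands in for "scan every JAR on the classpath for this bundle", and all names are hypothetical.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

// Sketch: cache expensive lookups (e.g. resource bundles found by scanning
// jars) so the scan happens once per key, not once per use.
public class BundleCache {
    static final AtomicInteger scans = new AtomicInteger();           // counts expensive loads
    static final Map<String, String> cache = new ConcurrentHashMap<>();

    // Stand-in for "scan all available jar files for this bundle".
    static String expensiveScan(String name) {
        scans.incrementAndGet();
        return "bundle:" + name;
    }

    static String lookup(String name) {
        // Only the first lookup for a given name pays the scan cost.
        return cache.computeIfAbsent(name, BundleCache::expensiveScan);
    }

    public static void main(String[] args) {
        lookup("messages_en");
        lookup("messages_en");   // second call is served from the cache
        System.out.println("scans=" + scans.get());  // scans=1
    }
}
```

`computeIfAbsent` also makes the cache safe when tests run in parallel inside one JVM — which matters for the single-JVM techniques discussed later in the talk.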
So, the Spring context loader — that's another thing that takes a lot of time. Every time Spring loads its context, it has to initialize all the available beans. Many of our test cases were initializing the Spring context on every run, and loading that context was taking around five to ten seconds. When you start talking about thousands of tests, that time is huge, and that is what we wanted to get rid of. So we refactored our test cases to reuse a single Spring context, and that again saved us a lot of time. This was another discovery: our test cases were sending out a lot of emails. In the context of those specific tests, we were not actually checking whether the emails were sent correctly — whether the subject was right, whether the recipients were right. That was not the scope of those tests. But because the flows under test sent mails as a side effect, the mails were getting sent — and we weren't even aware of it. Again, CPU profiling highlighted that this was taking time; sending email is an expensive operation. So we started finding all such activities that did not need to run in the context of those tests, and we started eliminating them: if it's a test environment, that particular activity is skipped. Next, Java's Calendar API — it's horribly slow. In our analytics, we use date arithmetic heavily, and what our profiling showed was that something like 87% of the time in one of our processes was spent just doing date arithmetic: adding to dates, subtracting from dates, comparing dates. I couldn't even imagine that there was a problem like that in our code. And notice that we started out profiling our test cases, but where we landed was our production code as well.
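Going back to the Spring context point above: the essence of the fix is "build the expensive context once, let every test reuse it". Here is a minimal stand-in sketch — `Context` is a fake placeholder for an ApplicationContext, not Spring itself; with real Spring, the test framework's context caching (`@ContextConfiguration` with the same locations across test classes) achieves the same thing.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Sketch: initialize the expensive "context" once and reuse it across tests,
// instead of paying the 5-10 second startup on every test run.
public class SharedContext {
    static final AtomicInteger initCount = new AtomicInteger();

    static final class Context {
        Context() { initCount.incrementAndGet(); /* imagine 5-10s of bean wiring */ }
    }

    private static volatile Context instance;

    static Context get() {
        if (instance == null) {                       // double-checked lazy init
            synchronized (SharedContext.class) {
                if (instance == null) instance = new Context();
            }
        }
        return instance;
    }

    public static void main(String[] args) {
        Context a = get();   // first test pays the initialization cost
        Context b = get();   // every later test reuses the same context
        System.out.println(a == b);            // true
        System.out.println(initCount.get());   // 1
    }
}
```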
So our test cases were slow partly because of components we were using in our production code. And this was not only impacting the test runs — our production batch processes were also getting slower because of the same problem. So we switched to the Joda-Time library and verified that everything still worked as expected. By replacing the Calendar and deprecated Date APIs with Joda-Time, we brought the date-arithmetic time down by around 93%. That was a huge saving — and not only for our tests; production benefited hugely as well. How many of us are using Ant? Okay. So, yes, we are also still using Ant, and Ant's junit task has two relevant options. The first is fork: when fork is yes, it tells Ant that whenever it runs JUnit, it should create a new JVM and run the tests inside that JVM. Now, if you have not provided the second option, forkmode, its default is perTest. What that means is that if you have thousands of test classes, Ant is going to create a new JVM for every single test class. That's pretty slow — creating a JVM that many times is very slow. Also, remember the caching we talked about earlier: if you're creating a JVM per test class, the caching is not going to work. After this discovery, we changed it to forkmode once, and now Ant creates the JVM only once and runs all the tests within that one JVM. That started saving us time. This is actually a classic antipattern — you'll see a lot of people recommend that you leave the default fork option so that tests run in isolated JVMs. But what you're losing by doing that is everything Ashish pointed out: all the caching. You're losing that advantage entirely.
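The fork change described above is a one-attribute fix in the Ant build file. An illustrative snippet (property and path names are made up; `fork`/`forkmode` are the real attributes of Ant's junit task, whose forkmode default is perTest):

```xml
<!-- Fork a single JVM for the entire test run, instead of one JVM
     per test class. This keeps in-process caches warm across tests. -->
<junit fork="yes" forkmode="once" printsummary="on">
    <classpath refid="test.classpath"/>
    <formatter type="xml"/>
    <batchtest todir="${reports.dir}">
        <fileset dir="${test.src.dir}" includes="**/*Test.java"/>
    </batchtest>
</junit>
```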
Forking per test is actually a very bad practice; it's better to fork once. Yes. While doing CPU profiling, we were still finding the tests running slow and trying to figure out where; the profiling alone did not reveal it, but while thinking about where the problems were, we came across this particular option. If you're using Maven, it has similar configuration, but I believe by default it forks only once. So let's look at the second principle: divide and conquer. These days we have powerful machines — they have a lot of CPUs, and every CPU has a lot of cores as well. So the next goal was to break our problem down into smaller problems, so that we could make use of the CPUs as much as possible. This is how the new build pipeline looks. I talked about multiple jobs running in parallel: what we have done is, instead of creating one big job which runs all the test cases, we have split our test cases into multiple jobs which can run in parallel. Once you do that, you start utilizing as many CPUs and as many cores as possible, and it speeds up the entire process. There is another option: Jenkins itself can be deployed in a master/slave configuration. So if you do not have powerful machines, but you have many machines, you can have one master and multiple slaves, distribute the jobs across those slaves, and run the tests there. That again saves a lot of time. So far we've talked about either creating parallel jobs or distributing them across different machines. But there is another option.
There is a concurrent JUnit runner, written by Mathieu, and what it does is this: say you have a test class which contains 10 scenarios, and those 10 scenarios are isolated — they can run independently of each other. If that's how your test cases are, then using this runner you can run those tests in parallel: one test class, multiple test scenarios in it, all running in parallel with the help of this runner. It's going to save time. The important thing is that this all runs inside one JVM, not across multiple JVMs, so you keep all the caching benefits. Cool. So let's look at our next principle: fail fast. All right, the last one, which is the easiest, is fail fast. We talked about fail fast earlier, where we said that we want feedback as early as possible, and if something goes wrong, we want to know as soon as possible. So one of the things we tried was to look at our build and ask: what are the items in our build pipeline that give us the fastest feedback, or the most important feedback? Certainly, building Javadocs is not going to give you any meaningful feedback, so we moved things like that — which earlier ran as part of the entire build — out of the way. We looked for things that were not going to give us meaningful or important feedback and started deferring them to later in the build process, and we started pulling the most important things up in the pipeline. Another thing we looked at: among our tests, some were really fast, some were slow, some were terribly slow. If you just run them all together, you're slowing down the fastest-feedback tests because of the slower ones. So we split those tests into different categories: we run the fastest tests first and the slower tests later.
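Going back to the concurrent-runner idea above: its essence — run one class's independent scenarios in parallel threads inside a single JVM — can be sketched with plain JDK executors. This is a toy illustration, not the actual runner; all names are made up, and `Thread.sleep` stands in for slow test work.

```java
import java.lang.reflect.Method;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Toy sketch of a concurrent test runner: all "test*" methods of one class
// run in parallel inside one JVM, so in-process caches stay warm.
public class MiniConcurrentRunner {

    // Stand-in test class with three independent 100ms scenarios.
    public static class MyScenarios {
        public void testA() throws InterruptedException { Thread.sleep(100); }
        public void testB() throws InterruptedException { Thread.sleep(100); }
        public void testC() throws InterruptedException { Thread.sleep(100); }
    }

    static long runAll(Class<?> testClass, int threads) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            Object instance = testClass.getDeclaredConstructor().newInstance();
            List<Future<?>> futures = new ArrayList<>();
            long start = System.nanoTime();
            for (Method m : testClass.getMethods()) {
                if (m.getName().startsWith("test")) {
                    futures.add(pool.submit(() -> m.invoke(instance)));
                }
            }
            for (Future<?> f : futures) f.get();   // wait + propagate failures
            return (System.nanoTime() - start) / 1_000_000;  // elapsed ms
        } catch (Exception e) {
            throw new RuntimeException(e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        // Sequentially this would take ~300ms; in parallel it should be ~100ms.
        System.out.println(runAll(MyScenarios.class, 4) + "ms");
    }
}
```

This only works, as the talk stresses, when the scenarios are truly isolated from each other — shared mutable state between test methods breaks the whole idea.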
That way, we get feedback as early as possible. Back in 2007, I and a bunch of other people wrote a framework called ProTest — ProTest stands for "prioritized test". Here's the idea: today, when you run any JUnit suite, it's pretty stupid — it executes the tests in the same order every time. Even though test number 15 failed the last time you ran it, it will still run 1, 2, 3, 4, 5, all the way to 15, and then say, oh, 15 failed again. That's stupid. You want to reprioritize your tests dynamically at run time, and run the tests that are most likely to fail first. That's what ProTest does, using a bunch of techniques — for example, dependency analysis: if you change class A, then all tests that depend on class A are run before the other tests. So it does a bunch of things, and that was another thought process around fail fast: how do you prioritize your tests dynamically so that the most likely failures fail fast? Also, one thing you'll notice is that a lot of people do clean builds every single time. For certain builds in your pipeline, you don't need a clean build: you could do an incremental build and get really fast feedback, and maybe after that run a clean build to make sure that, even after cleaning everything, it still gives you the same behavior. That's another technique applying the fail-fast principle: do the incremental build really quickly and see if something goes wrong; if nothing goes wrong, then do the clean build. But if it's going to fail right there in your incremental build, then maybe there's something for you to look at and validate immediately. So, quickly — I know we are running short of time — these are some of the techniques we used to fail fast in our build. Thank you.
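The core of the ProTest idea above — previously failed tests run first — can be sketched as a stable reordering. This is an illustrative toy, not ProTest's actual implementation; in a real runner you would persist the failure set between builds.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Sketch: reorder tests so that tests which failed on the last run execute
// first, making a still-broken test fail the build as early as possible.
public class PrioritizedOrder {

    static List<String> prioritize(List<String> tests, Set<String> failedLastRun) {
        List<String> ordered = new ArrayList<>(tests);
        // Stable sort: previously-failed tests move to the front,
        // everything else keeps its original relative order.
        ordered.sort(Comparator.comparingInt(t -> failedLastRun.contains(t) ? 0 : 1));
        return ordered;
    }

    public static void main(String[] args) {
        List<String> tests = Arrays.asList("t01", "t02", "t03", "t15", "t20");
        Set<String> failed = new HashSet<>(Arrays.asList("t15"));
        System.out.println(prioritize(tests, failed));
        // [t15, t01, t02, t03, t20]
    }
}
```

ProTest layers further signals on top of this, such as the class-dependency analysis mentioned in the talk (change class A → run A's dependent tests first).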
So all these techniques have helped us get our build pipeline time below 10 minutes. Just to summarize: focus on the bottlenecks, and watch out for the bottlenecks shifting. Make sure you avoid file operations as much as possible. In your own context, figure out the optimal database or data-set size — use as small a data set as possible so that the IO operations are lower. Use Robocopy or rsync wherever possible; that will again speed up your IO operations. RAM drives — that was a discovery; those are nice things, and if you're in the Linux world, Linux also has a concept called tmpfs, which is again a memory-backed drive you can make use of. CPU profiling is another thing you want to do. Typically we don't profile our test cases, but it helps uncover a lot of issues within the code itself. Log files, too — doing good logging, and actually going through the logs to uncover what the problems are, is a good practice. The next thing is verifying the build tools you are using: make sure you are using the right options for them, and see what's appropriate in your context — that's going to save you a lot of time. One point Ashish didn't put in there, which is a constant debate between us, is that if you were on Linux, most of these problems would have been solved automatically and you wouldn't have had to struggle so much — but some people are still stuck with Windows. Yeah, that's an ongoing debate. Next is divide and conquer: we talked about how you can use different Jenkins configurations and the concurrent JUnit runner to divide your tests and run them in parallel. The last thing is failing fast: if there is the slightest probability of failure, you want to fail as soon as possible, so that you have enough time to recover and try again.
So you want to make sure that you restructure your code, your test cases, your build pipeline, so that you get that quicker feedback. With all this, this is what our build time versus number of builds graph looks like. About 10-12 months ago our build times were increasing, and this is the number of builds that occurred at that point in time. Then we started fixing the different problems. All these problems were not fixed in one day; it took time for us to identify and fix them. As that happened, our number of builds started going up. The very first big dip that you see here was because of getting rid of the workspace duplication, the fork option, and RAM drives. The next dip that you see is because of caching the resources, the Spring context, and avoiding unnecessary activities like emails. The third dip that you see there is from using Joda-Time and deprecating the Java Date API. And the last dip that you see there is from running tests concurrently using the concurrent JUnit runner. Again, we have not used that thoroughly. We still have to go back and make changes to our test cases so that they are isolated. We have a couple of tests which are dependent on each other. We have to work on those, we have to refactor those, and then we will be able to make full use of the concurrent JUnit runner too. So the important point in the last slide is that you will notice, as your build time keeps going down, the number of times people are willing to check in code daily keeps going up. This was theory when we started, and now we actually have real data from the project where we can say this is actually helping people. That makes a big difference in terms of selling this to the business stakeholders as well. Now you have these tools available. Unlike us, who had to discover these things as we went along and implement them, you are now aware of them and can pick and choose in your own context. I think you should profile first and see where your bottlenecks are instead of trying to apply things randomly.
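The "caching the Spring context" dip mentioned above comes down to one pattern: build an expensive fixture once per configuration and reuse it across tests. Below is a minimal plain-Java sketch of that idea (Spring's test framework does this for you when tests share the same configuration); the configuration key and builder are hypothetical.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

// Sketch of fixture caching: pay the expensive construction cost (e.g. wiring
// up an application context) once per configuration key, then reuse it.
public class ContextCache {

    private final ConcurrentHashMap<String, Object> cache = new ConcurrentHashMap<>();
    private int builds = 0; // counts expensive builds; not thread-safe, fine for a sketch

    public Object getOrBuild(String configKey, Function<String, Object> builder) {
        return cache.computeIfAbsent(configKey, key -> {
            builds++;
            return builder.apply(key);
        });
    }

    public int buildCount() { return builds; }

    public static void main(String[] args) {
        ContextCache cache = new ContextCache();
        // Three "tests" sharing one configuration pay the build cost only once.
        for (int i = 0; i < 3; i++) {
            cache.getOrBuild("app-context.xml", key -> new Object());
        }
        System.out.println("builds=" + cache.buildCount()); // prints builds=1
    }
}
```

The flip side, which the talk hints at, is that cached fixtures must not be mutated by tests, or the sharing itself becomes a source of flaky failures.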
You might find something completely different that we have not seen yet. It went on for about 4-5 months. That never happens. Just to clarify, "that never happens" as in we don't even waste time convincing the business about things like this. When it took 4-5 months, it basically means half an hour here, half an hour there. Whenever you are taking a break from your regular work, spend half an hour on this. If we had focused and sat on this, it would have been much shorter. What we also do is run hackathons. We have a hackathon every quarter. That's another time that we get, where we can do anything that we want in that particular hackathon and help add more value to the company. Yes, 24 hours. 24 continuous hours. In that 24-hour cycle we have done a lot of things which have contributed to our own... In the previous conference we actually talked about a game that we built for how you are performing. That also came out of the hackathon. We've actually open-sourced a bunch of these things, so you could look at that as well. Those are different kinds of tests that we have there. Some of them are unit tests, some of them are BDD tests, some of them are REST tests, some of them are integration tests, workflow tests. That's why we are saying there are different tests taking different amounts of time, because of all these different categories. All the jobs that you see here, those are the different types of tests that we have. So obviously, after all this, the developers are really happy, the colleagues are happy. They are able to commit their code frequently, they are no longer afraid to commit code during the evening, and most importantly, the family is happy that they are able to go back home early. All this was not possible without a lot of references. We had to dig into a lot of problems; there are not many references available. The slides will be available for you. So, we don't have a build engineer; the team takes care of it.
Ashish leads the team, so he is kind of responsible for the entire team, and then developers and testers work together under him. This goes all the way. Yeah, so, the Jenkins setup, that's our integration environment. We are actually trying to go towards CD. So what we are doing is trying to automate a bunch of the installation as well, like the deployment installation. Our plan is that the same job can install the entire application on your local development machine, on the integration machine, on the stage environment, on production. So there is consistency across all the different environments; that's where we are heading. For some of our products we have already done that; for the product under consideration we haven't done that yet. So our basic problem was time. Because time was the problem, we first attacked the CPU part of it. We actually did look at memory, and we saw that memory was not an issue. We could see that, based on the RAM that we had in that particular server, we were never actually running out of RAM, not even hitting above 75% usage, so we said that's not something we want to focus on. We eliminated that very quickly, basically. I missed the question. No, no, we didn't get into the details at the lowest level of it, no. So the idea there was, we started with some presumptions about what the problem was going to be, and we ended up finding different problems. Let's say the disk I/O: we thought hitting the database a lot was the problem. We assumed that, and that's how we started focusing on the problem. But after doing the profiling, the problem shifted somewhere else. The database I/O itself was not the problem. We spent a whole bunch of time trying to get everything in memory and still not seeing the results. That's when we went and did CPU profiling.
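The "profile first, then optimize" lesson in that answer can be made concrete with a crude sketch: time each phase of the pipeline and report the slowest one, so effort goes where the bottleneck actually is. The phase names and sleeps below are hypothetical stand-ins for real build steps; a real setup would use a profiler such as the JVM Monitor plugin mentioned later.

```java
import java.util.*;

// Crude wall-clock phase timer: run each pipeline phase once, record its
// elapsed time, and return the slowest phase.
public class PhaseTimer {

    public static Map.Entry<String, Long> slowest(Map<String, Runnable> phases) {
        Map.Entry<String, Long> worst = null;
        for (Map.Entry<String, Runnable> phase : phases.entrySet()) {
            long start = System.nanoTime();
            phase.getValue().run();
            long elapsed = System.nanoTime() - start;
            if (worst == null || elapsed > worst.getValue()) {
                worst = new AbstractMap.SimpleEntry<>(phase.getKey(), elapsed);
            }
        }
        return worst;
    }

    public static void main(String[] args) {
        Map<String, Runnable> phases = new LinkedHashMap<>();
        phases.put("checkout", () -> sleep(10));   // stand-in durations
        phases.put("compile", () -> sleep(50));
        phases.put("unit-tests", () -> sleep(20));
        System.out.println("bottleneck: " + slowest(phases).getKey()); // prints bottleneck: compile
    }

    private static void sleep(long ms) {
        try { Thread.sleep(ms); } catch (InterruptedException e) { Thread.currentThread().interrupt(); }
    }
}
```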
Was that session yesterday? There was a session yesterday, or the day before, that talked about the test pyramid. Our focus is that at the UI level we want very few test cases, and the only job of those test cases is to do navigation testing: whether I'm able to navigate from one screen to another properly or not, and whether there are any crashes or not. Otherwise, most of those UI tests are slower, because at that level we try to do a lot of data tests as well, whether the data is showing properly or not, and it starts hitting the database. That's where it becomes slower. So we did have a bunch of UI tests when we started, but another initiative we've been focusing on is to essentially get the test pyramid right. We've been pushing a lot of those tests down to the lower layers, and we've reduced that quite a bit. We still have a bunch of them left, but we're reducing it further. So hopefully the 10 will go down to 5 at some point. Jenkins has a built-in plugin which allows you to do the master-slave setup. Most CI environments have that these days. TeamCity has that, Jenkins has that. Most of them have a master-slave configuration; it's basically a plugin for Jenkins. JVM Monitor, that's the Eclipse plugin that we use here. We'll go through this quickly. That seems like too much effort: to take a legacy project which was built in 2002 and try to introduce idempotency into the entire build process. I think it's going to take a while. I mean, there are low-hanging fruits that we just went after to start with. No, so, the first environment that we had, we had it in our office. Eventually we shifted our environment to our data center, onto a SAN, and we had SSDs at the tier 1 level itself. So after we moved there the RAM drives were not required, and as the profiling suggested, the database operations were not the problem.
The problems were other file operations that we were doing. So it didn't help. Sonar? We have integration with Sonar, yes. But what we do is, our inspection job doesn't run with every build. It runs once per day. It's one of those trade-offs that we talked about: do you want to run Sonar every single time a developer checks in, or do you want to run it once in a while? We are not actively looking at Sonar every single time; we look at it once in a while, maybe once a day or so. So we don't want to spend those additional CPU cycles running Sonar on every build. Why do you say that? We have achieved it. We are running all those tests right now. The entire build pipeline runs per check-in; otherwise you are not getting the feedback quickly. We are prioritizing which tests to run in what order, but we do want to run all the tests every time a developer checks in code. That's very important for us. As Ashish said, we are moving towards continuous deployment, where we do want to automate the entire cycle. We are not going to have another set of tests that runs nightly or things like that. So wherever we see tests becoming a bottleneck, we want to look at those tests; we want to refactor them or dismantle them. We do it with Jenkins, the standard tools. Jenkins, yeah. Because we are a Windows shop, we are using a lot of PowerShell. We have Jenkins for scheduling, triggering, and orchestration, and in between we have PowerShell to do the Windows-specific activities. It solves part of the problem, but not the entire CD problem, though. Next year. All right. Thank you. Okay, we have run short of time. Thank you very much.