Welcome everyone to today's talk, which is about data-informed transformation: improving the performance of agile teams using statistical analysis, by Naresh and Shriram. Without further delay, over to you, Naresh and Shriram.

Great to see all the familiar names. Unfortunately we can't see your faces, but we can see the names, and it's good to see all of you; thanks for joining in. This is going to be a 60-minute session. Shriram and I have been partnering at a client, and some of this will be based off that, and some of this is based on previous experiences we've had. We'll try to hit all the key lessons from the journey we've gone through so that it is useful for folks. It's still an ongoing journey; we've not arrived at the destination, and I don't think there is such a thing as a destination in this evolving world. Let me quickly share my slides and get started here.

We also want to thank Rakesh, who I do see in the audience. Rakesh has been an instrumental part of this work, so it is really a joint presentation between Shriram, Rakesh and myself. Shriram, do you want to say a couple of words about yourself?

Sure. Am I audible? Yeah. Okay. Hello everybody. For the last year or so I have been an independent consultant, but prior to that I was with ThoughtWorks for a long time, and when I quit ThoughtWorks I was a VP of transformation advisory, advising clients on improving the performance of their organizations in the context of digital transformation, along topics such as moving from projects to products, organization design, metrics, and those kinds of things. So there's quite a bit of variety in the kind of consulting I do.
In this particular engagement I've been working with Naresh and Rakesh for the last several months, and the work has evolved. Here the context is more around software delivery: a data-informed look at software delivery, trying to extract patterns out of that data and then using them to inform the future of the transformation efforts. I've enjoyed this journey so far, and I'm excited to be sharing our experience with all of you. Thank you.

Absolutely. Alright, thanks Shriram for that introduction; let's dive straight into the topic. We've broken this broadly into three sections. First we'll set some background and context for what we're trying to do. Then we'll look at some analysis that was actually done and the challenges we faced. Finally, we'll get into the lessons learned when we applied the actual analysis: what did we learn from it, and what lessons can other folks apply in their own context?

When we talk about improving the performance of organizations, specifically from a software delivery perspective, CXOs would typically say: I want faster delivery; I want more reliable delivery; I want frictionless delivery, meaning things seamlessly flow through the organization without getting stuck. So when we look at the common performance-improvement objectives, they may be these three at the top level for a lot of organizations. In this presentation we'll break that down a little, get into the next few levels inside it, and look at how we then use data to make informed decisions around introducing interventions as part of the transformation.
Just a disclaimer here: this is for folks involved in large-scale software delivery. In our case we are looking at upwards of 40,000 engineers, so it's a fairly large-scale software delivery effort. Shriram had a talk earlier today where he talked about the impact on business outcomes; in this particular talk we are intentionally not going to focus on the business-outcome side of things, but purely on the developer, engineering and delivery side, which I think is also a very important area that needs to be delved into. If you missed Shriram's earlier talk, it will be recorded and available, so you can have a look at that.

Quickly jumping in: most of you are probably either driving some kind of transformation or involved in one; that's why I'm hoping you've joined this session. When you go into an organization, or you're trying to transform a team, you might look at introducing certain practices like Scrum, scrum of scrums, PI planning, and the several other technical practices listed here. It's like a menu full of practices at your disposal that you could use to influence teams and introduce interventions in their ways of working. The question we asked ourselves is: how do we know which teams could benefit from which of these interventions? How do we figure that out?
Usually we just expect that everyone will adopt everything. We assign some kind of fluency or maturity rating to the teams, get them to check off the boxes, go through a series of trainings, and so on. That, unfortunately, is the state of transformation in a lot of places, and I think it's a bit of a disservice, because if we don't have data to back what we are doing, we end up being very prescriptive, with a one-size-fits-all attitude. But we all know that the very essence of agile is that one size does not fit all: each team has its context, and things have to be tailored to that specific context. I think we heard this from Andy Stock yesterday as well.

So what typically happens is that people start off, and at some point the CXOs wake up and ask: we originally started with faster delivery, more reliable delivery, frictionless delivery; has this transformation helped us achieve that? People feel caught off guard when such questions are asked, or they question how you can even quantify things like this, because it's a transformation, and so forth. So we want to deep dive a little into how teams could actually approach this, and how they could show the CXOs that the transformation is in fact helping achieve the objectives they are trying to drive from a software delivery perspective. Of course, this requires analysis and you need to deep dive to be able to do it. But before we jump in, let me quickly step back and look at the overall picture, from idea to cash, or idea to go-live.
If you look at the overall flow, that's what we call the lead time: from idea to cash, through various stages which have some activity, some work centers, and some wait stages in between. The one that we are specifically interested in for this talk is this little blue development box, so let's zoom into that a little and see what it is.

Now, I know a lot of you might be thinking: agile software development is not a linear, waterfall kind of model; it should be iterative, these stages are intermixed, and there are feedback loops going back and forth. That's absolutely right, and that's how it should be. But anyone who's worked in large software delivery organizations will realize that, unfortunately, it effectively ends up being linear, not as iterative as you would like.

If we zoom in a little and define a couple more terms, just so we're all on the same page: within development, some amount of discovery needs to happen, then solutioning, then planning, then the actual development and in-sprint automation; then you have things like integration testing and other kinds of chaos testing and reliability testing; and finally you have to wait to get into production. That's the area we're going to double-click on today. This entire cycle, from discovery up to actually going live, is what we would call the delivery lead time.
Even within that, if you zoom in further: the time from when the dev teams are done until the change actually goes live is what we call the change lead time (CLT); folks who've read the DORA reports will be familiar with this. That's the area we want to zoom into today. The questions we have are: have these times improved because of our interventions, and if they have improved, can we quantify by how much? That's what we want to be able to answer for the CXOs, so they understand that the investment they are making in the transformation is actually fruitful, with a reasonable ROI.

So let's look at question number one: have these times improved, and by how much? To answer this question we have two prerequisites. The first is that we need to establish a baseline, some historical data against which we can compare and say whether things have improved, and if so, by how much. The second is that we need to do a like-for-like comparison. What do I mean by that? Take CLT, for example. If you had a certain feature and you measured its CLT, you could say the CLT depends on the size of the feature, because features of different sizes may require different amounts of CLT. It may depend on what kind of release train or release bundle it actually gets shipped in, how many bugs were found, how much testing time went in, and, once bugs were found, how much effort went in from the dev side. So you might say: here are a few of the factors that influence the CLT, and now I want to compare two features, maybe three months apart.
Right. If you just compared the two features as-is, without looking at factors like size and bug counts, it may not be a like-for-like comparison, so you probably need to normalize the CLT. I've given a simple formula here, but it could be something more complicated. The point we're trying to make is that these are the prerequisites, the things one needs to think about while trying to answer the first question of whether the CLT and delivery lead times have actually improved: you need to establish a baseline, and you need a like-for-like comparison.

I'll jump ahead a little and talk about the second question, which I think is even more interesting as you get into it: how much of the reduction is due to the intervention we are doing? You might be introducing a set of practices, and you might want to say: by introducing these practices, how much have I reduced the change lead time? Sometimes that number does not move as quickly as you would like; it takes time. So you might decompose it a little and try to understand what actually goes into a CLT being a certain number. The amount of time it takes for integration testing is one factor, along with the amount of time it takes to fix the bugs, the amount of time it takes to actually deploy your changes, and the amount of waiting in all this, which is where the friction piece we talked about earlier comes in. So you need to be able to look at all of these areas to understand how the CLT is being impacted.
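As a concrete toy version of the normalization idea above, the simplest possible formula divides CLT by feature size to get days per story point. This formula and the numbers are illustrative assumptions, not the actual normalization used at the client:

```python
def normalized_clt(clt_days, feature_size_points):
    """Change lead time per story point, so features of different
    sizes can be compared on a roughly like-for-like basis."""
    if feature_size_points <= 0:
        raise ValueError("feature size must be positive")
    return clt_days / feature_size_points

# Two hypothetical features delivered three months apart:
march = normalized_clt(clt_days=30.0, feature_size_points=20.0)  # 1.5 days/point
june = normalized_clt(clt_days=24.0, feature_size_points=12.0)   # 2.0 days/point

# Raw CLT fell from 30 to 24 days, but per point of delivered scope it
# actually rose -- the raw comparison alone would have been misleading.
```

A real normalization would likely also fold in bundle size and bug counts, the other factors just mentioned.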
Now, as you start breaking this down, you realize these factors are themselves dependent on a few more attributes, and you slowly start seeing a kind of contribution tree building out: change lead time depends on some things, and those things in turn depend on others. If you take the integration testing time, the factors that influence it include the size of the feature, the extent of automation you currently have, and how available and productive the testers are to work on that particular feature's testing. Again, this is not an exhaustive list; in fact we have a much larger list of things that can influence the contribution tree. Just a subset is presented here so you understand the concept.

When you start looking at each of these, you can think of things you could do to improve them. You could say: I need an initiative to shift things left; I need to invest more in test automation; I need to hire and upskill people; I need to reduce my batch size; or I need to invest in better test environments, fully automated ephemeral environments, and so on. So you could introduce these kinds of concepts into the organization to improve some of the factors that influence the CLT. And you could then think of specific practices or techniques that could help with each. For example, if feature size is a factor, we would like to introduce feature slicing. From a shift-left point of view, continuous integration is of course important, but you might also want to do contract testing, among other things.
To reduce the batch size, you may also want to introduce practices like independent deployment; Nilesh and I spoke earlier today about how we are using feature hub as a way to enable independent deployment. So you can add the last layer of this contribution tree: the practices you could introduce to influence these factors. So far, I hope everyone's with me on how you start thinking about any kind of metric and breaking it down into its contribution tree.

Of course, the question is how much of the improvement is due to our intervention. While understanding the historical data, it's important to understand what has been influencing what, what the relationships between these factors are for this particular organization and this particular team, and which factors carry more weight than others. But I would argue the whole reason you're doing this is that you want to figure out where you should invest and focus in the future: what is going to give you the biggest bang for the buck. You want to know what your focus areas should be, based on this data and this analysis. This is where we were headed, and unfortunately there is no silver bullet here, no simple follow-the-book recipe in this space. So this is where we turned to statistical measures and methods, and I'd now request Shriram to take over and walk us through this part of the journey.

Thank you, Naresh. Let me just bring up my screen and turn off these floating controls. Is this visible, Naresh? Yes, all good. All good. Okay.
So yes, what we saw is that there are multiple factors that contribute to the metrics that matter at the top of the tree. And because there are multiple factors, there is potential to use statistics to understand the contribution of those factors. Of course, that requires data, and most likely, if you are a large-scale software setup, you're using something like Jira, so that becomes the source of data. When you're doing statistical analysis you need enough data points: per team or per portfolio, I would say at least 100.

Let's say you have that data, or you can obtain it, because it's reasonable to expect that you can obtain this kind of data from your systems. How long did it take from when a feature was marked development-complete until it went live? What was the feature size (the total of the points of all the stories in that feature)? What was the size of the bundle it was released in (if one feature was released independently, that's a bundle size of one; if multiple were released together, you'll have a bundle size greater than one)? How many bugs were reported after the development-complete stage, during integration testing or the later stages of testing? How many testing days were required to perform that testing, and how many developer days were required to fix the bugs that were reported? Again, this might not be readily available from your systems, but with some amount of custom reporting and data manipulation, you should be able to arrive at this picture.
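As an illustration of assembling that picture, here is a hedged sketch that derives bundle size from a hypothetical issue-tracker export by counting how many features shipped in the same release. The field names and records are made up:

```python
from collections import Counter

# Hypothetical rows exported from an issue tracker (field names are assumptions)
features = [
    {"id": "F-101", "release": "R1", "size_points": 8,
     "bugs": 3, "test_days": 4, "bugfix_dev_days": 2, "clt_days": 21},
    {"id": "F-102", "release": "R1", "size_points": 5,
     "bugs": 1, "test_days": 2, "bugfix_dev_days": 1, "clt_days": 18},
    {"id": "F-103", "release": "R2", "size_points": 13,
     "bugs": 6, "test_days": 7, "bugfix_dev_days": 4, "clt_days": 35},
]

# Bundle size: how many features went out in the same release bundle.
# A feature released on its own gets bundle_size == 1.
bundle_sizes = Counter(f["release"] for f in features)
for f in features:
    f["bundle_size"] = bundle_sizes[f["release"]]
```

Each row then has every column the regression below needs: the dependent variable (`clt_days`) and the candidate factors.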
Once you have this picture, what's next? We can run a statistical analysis, in particular what's called a multiple regression analysis. Now, given the nature of this talk, we expect you to have some knowledge of statistics; we don't have the time here to explain what all these analyses mean, so this is not a statistics tutorial. I'm hoping at least some of you have some familiarity with this kind of analysis.

Basically, we identify the dependent variable, CLT, which depends on one or more of these factors. We don't know exactly how it depends on them in the case of our teams, but logically, using our experience and knowledge of software delivery, we know that the time from development-complete to go-live would depend on the size of the feature, the size of the bundle, the number of bugs found, and the effort it took to test and fix. There are some relationships between these variables themselves, and we will come to that later; as long as those relationships are not too strong, we can still model them as independent variables.

Before we get to the results of the analysis, what do we need to ensure to do a proper analysis? One, we have to normalize all the variables: the data ranges have to be normalized to the same range, so that when you get the output of the regression, the coefficients are comparable. Two, we have to make sure there is no multicollinearity between the independent variables we are modeling. That is the preparation stage. Then, once you run the analysis, before we can interpret the results we have to validate that the model is meaningful.
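A minimal sketch of that preparation stage, on synthetic data: min-max normalization of the predictors to a common [0, 1] range, plus a simple multicollinearity screen via pairwise correlations (a fuller check would use variance inflation factors). All numbers are synthetic stand-ins:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200  # at least ~100 data points, as suggested above

# Synthetic predictors standing in for the tracker-derived columns
feature_size = rng.uniform(1, 13, n)
bundle_size = rng.integers(1, 6, n).astype(float)
bugs = rng.poisson(3, n).astype(float)
X = np.column_stack([feature_size, bundle_size, bugs])

# Min-max normalize every column to [0, 1] so the regression
# coefficients will be comparable across predictors.
X_norm = (X - X.min(axis=0)) / (X.max(axis=0) - X.min(axis=0))

# Simple multicollinearity screen: flag any pair of predictors whose
# absolute pairwise correlation exceeds 0.8.
corr = np.corrcoef(X_norm, rowvar=False)
flagged = [(i, j) for i in range(corr.shape[0])
           for j in range(i + 1, corr.shape[1])
           if abs(corr[i, j]) > 0.8]
```

With independent synthetic predictors, `flagged` comes back empty; on real tracker data, any flagged pair would mean dropping or combining one of the two variables before fitting.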
For that, we primarily use two measures: the p-values and, in the case of multiple regression, the adjusted R-squared. These are also referred to as statistical significance and explanatory power. That is one part. Secondly, we also run some predictions. Before that, we split our data set into a training set and a test set. We develop the model on the training set, run it in prediction mode on the training set, and observe the prediction errors. Then we run the prediction afresh on the test set and again observe the errors. We make sure that the errors are small and similar in both cases. If you do all this, you can be reasonably sure that the analysis is on firm ground and worth interpreting.

When you do all this, you might sometimes find that the statistical significance itself is not there: the p-values are not great. That, of course, throws the whole analysis into question. But because we are using our expertise, not just trying to correlate random variables, and we know these variables influence CLT, it will often happen that you get good scores for statistical significance, yet your model has poor explanatory power: the adjusted R-squared values are not great. Something like 0.85 is considered good, but you might get much lower than that. If that happens, it indicates that there may be other factors influencing CLT which we have not taken into account.

A typical example of such a factor is wait time. In most organizations, the way you set up your Jira or Azure DevOps does not allow you to measure wait time, does not allow you to figure out what the wait time was.
And therefore, unless you make changes to your workflows, wait time is sometimes not available, although it might be a significant factor influencing the CLT. Another thing: so far we are assuming all features are more or less the same except for their size, number of bugs, and so on. But it could be that features in one domain take much longer to release than features in another domain. We are factoring in the development and test time spent on a feature, but all developers and testers are not the same: you might have inexperienced and experienced people, more skilled and less skilled. We are not factoring their competence into this model, because that data, again, is not readily available. But if you come up with a result with poor explanatory power, you might want to consider rerunning the analysis after incorporating such additional variables into the model.

So let's say we did all that and got a meaningful result. Multiple linear regression will usually throw up a result like this, saying that CLT is — I've actually missed the intercept here — some constant, plus a few variables, minus a few variables. Plus means that as feature size goes up, change lead time goes up; as bundle size goes up, change lead time goes up; and so on. Minus means that as the number of testers increases, CLT goes down, or as the number of developers increases. Actually, I've made a mistake on this slide: this should not be test days. If it's test days, it's a plus; it should actually be the number of testers. So think of it as the number of testers: if you have more testers, then the CLT will go down.
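Stepping back to the validation stage for a moment: the train/test split, the comparison of prediction errors, and the adjusted R-squared check can all be sketched on synthetic data. (P-values would come from a statistics package such as R or statsmodels; they are omitted here to keep the sketch self-contained, and every number below is made up.)

```python
import numpy as np

rng = np.random.default_rng(42)
n = 200
# Synthetic, already-normalized predictors: feature size, bundle size, bugs
X = rng.uniform(0, 1, size=(n, 3))
# Synthetic CLT with a known linear relationship plus noise
clt = 5 + 8 * X[:, 0] + 6 * X[:, 1] + 3 * X[:, 2] + rng.normal(0, 1, n)

# Split into training and test sets
split = int(0.8 * n)
X_tr, X_te, y_tr, y_te = X[:split], X[split:], clt[:split], clt[split:]

def fit(Xs, ys):
    A = np.column_stack([np.ones(len(Xs)), Xs])  # intercept column
    coef, *_ = np.linalg.lstsq(A, ys, rcond=None)
    return coef

def predict(coef, Xs):
    return np.column_stack([np.ones(len(Xs)), Xs]) @ coef

coef = fit(X_tr, y_tr)

# Explanatory power: adjusted R-squared on the training data
resid = y_tr - predict(coef, X_tr)
r2 = 1 - (resid ** 2).sum() / ((y_tr - y_tr.mean()) ** 2).sum()
k = X_tr.shape[1]
adj_r2 = 1 - (1 - r2) * (len(y_tr) - 1) / (len(y_tr) - k - 1)

# Prediction errors should be small and similar on train and test
mae_train = np.abs(y_tr - predict(coef, X_tr)).mean()
mae_test = np.abs(y_te - predict(coef, X_te)).mean()
```

Because the synthetic data has a genuine linear signal, the fit recovers the coefficient ordering (8 > 6 > 3) and the two mean absolute errors come out close, which is exactly the "firm ground" check described above.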
That is roughly how you interpret the positive and negative signs. And the coefficients, because you've normalized the data ranges to begin with, are now comparable. What this equation says is that, for the data ranges in the analysis, the numerically highest coefficients (ignoring the sign) have the greatest influence on the CLT. In this case, the three greatest influencers are bundle size, number of developers, and feature size, because they have coefficients of 8.6, 5.5 and 4.6. What that means is that if you want to reduce CLT, maybe you should focus on these three factors, because they have the greatest influence on it.

Now, in this particular case, the first is bundle size, the second is number of developers, the third is feature size. Out of these, increasing the number of developers is potentially a management or staffing decision, whereas introducing practices that reduce feature size or bundle size is not necessarily a management decision: the principal engineers or senior technical people on the team can take that call and try to do something about it. And that gives us the answer. If what is in our control is, say, bundle size and feature size, then you can go back to the contribution-tree map and ask: which practice influences bundle size? That's independent deployment. And which practice influences feature size? We can do something like feature slicing.
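The interpretation step above can be sketched in a few lines: rank the normalized coefficients by absolute magnitude, where the sign gives the direction and the magnitude the influence. The values mirror the example equation on the slide; the variable names are illustrative:

```python
# Coefficients from a (hypothetical) fitted model on normalized data
coefficients = {
    "bundle_size": 8.6,
    "num_developers": -5.5,  # negative: more developers -> lower CLT
    "feature_size": 4.6,
    "num_bugs": 2.1,         # illustrative extra term
}

# Ignore the sign when ranking influence; keep it for interpretation.
ranked = sorted(coefficients, key=lambda name: abs(coefficients[name]),
                reverse=True)
top_three = ranked[:3]  # the biggest levers on CLT
```

`top_three` here comes out as bundle size, number of developers, and feature size, matching the three influencers called out on the slide.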
And that is how we come to know, for the data under investigation, whether it belongs to a particular team or a particular portfolio, that that team or portfolio will most likely get the greatest benefit from these sorts of interventions. That is how the statistical analysis helps us come up with the answer of what we should focus on in the future. But in a way it's a prediction: a prediction based on the past data, of course, since you've run the model on past data; it's telling you what the past is telling you.

We can verify it once we adopt the recommendation and actually introduce these practices into the teams. If those practices are really having an effect, they should have a trickle-up effect on the top of the tree: if one practice is having an effect, then, everything else constant, integration time should go down; similarly, if another is having an effect, then, everything else constant, wait time should go down. So after a few months we can check: did integration test time and wait time decrease for comparable features? When I say comparable features, just like Naresh said earlier, you need some sort of normalization activity to make them comparable. But once you do that, we can answer this question: did it reduce? And if they reduced, how much was the effect one level up: how much did that help improve CLT? That is how we verify the result of our actions, and our actions themselves were based on the result of the regression analysis.

So what we've seen, essentially, is a data-informed transformation loop. You might have come across build-measure-learn as the loop to build products.
You build something, you measure its effect with users or in the market, then you learn, and that informs your next round of building functionality. Now, in a transformation initiative you're not building something; what you're doing is designing interventions. You're saying: maybe we should adopt this practice, maybe we should use this technique, and so on. But you can use the same data-informed approach to transformation: you intervene a little, you measure the results of the intervention, you learn from that, and that helps you decide your next round of interventions. Ideally, before you begin this whole process, you might want to baseline the metrics you are interested in, CLT, delivery lead time, reliability and so on, and then start designing interventions and keep executing iterations of this loop. For transformation efforts, weekly iterations may be unrealistic, but quarterly or six-monthly iterations work, because you need time for the data to accumulate before making inferences from it.

So that was the sample analysis. Now we'll get into the actual analysis: what we actually did for a client. For the next few slides I'm going to ask Naresh to come in, because we did not begin with statistics, we began with Excel-based analysis, and I'll let Naresh speak about the first part of this.

Cool, thanks Shriram. Just before we jump in, I see there is one question; let's quickly make sure we've answered it before we move ahead. There is a question from Shravanan.
He's asking: while we logically know the CLT contributing factors, how do we know whether the identified contributors are really contributing to CLT or not? I think you might have asked this slightly earlier, and that's what Shriram actually went through and explained, so hopefully, Shravanan, that is covered. If not, please let us know and we will circle back. Just wanted to make sure we've addressed that piece. Please confirm that it is so. Cool. Unfortunately we can't see you, but we can still get that feedback. Alright, perfect.

So, in the spirit of frugal innovation, you want to start with the simplest possible thing. A lot of people will look down on you for using Excel, but it's actually a pretty powerful tool: it can give you a lot of insights, and you can iterate on it quite a lot before you decide what you might want to deep dive into. In our case we said: we want to understand the CLT month on month, how the CLT is doing, and how it compares to the feature velocity, that is, the features being completed. So we started plotting this data out in simple Excel, looking month on month at what our CLT looks like, how that compares to the feature count, whether there is any correlation between the two, and what the trends look like. Unfortunately, as you can see from the graph, we couldn't see a direct pattern. One hypothesis we had was that as the feature count increases, CLT will also increase. But the data did not fully concur with that; a lot of things did not match it.
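That first check, correlating monthly CLT against monthly feature count, can be sketched like this; the numbers are invented to show a case with no clear linear relationship, like the one described above:

```python
import numpy as np

# Hypothetical month-on-month figures (illustrative only)
monthly_clt = [30, 32, 30, 32, 30, 32]       # average CLT in days
monthly_features = [5, 5, 8, 8, 6, 6]        # features completed

# Pearson correlation between the two monthly series
r = np.corrcoef(monthly_clt, monthly_features)[0, 1]
# Near-zero correlation: this data gives no support to the hypothesis
# that a higher feature count drives CLT up.
```

Excel's CORREL function computes the same quantity; the point is only that eyeballing a chart can be backed by a one-line correlation check.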
So we wanted to dig a little deeper and understand what was going on. If we move to the next slide: we tried two things here. One, for each month, instead of looking at CLT as a single number, we broke it down into the components that contribute to it: the development time, the waiting time, the SIT time, and so on, and we looked for correlations between them. Two, instead of absolute numbers, we looked at percentages, to see whether a certain phase was making a disproportionately large contribution. What we quickly realized is that when you do this month-on-month analysis, especially when the CLT itself is much longer than any given month, you get carry-over effects: something has a delayed effect that only shows up in a later month, and vice versa. So you can't draw any clear conclusions from it, and we decided the month-on-month view of the data was not the right way to do this. So, moving to the next slide, we pivoted a little. On the bottom axis here you see feature IDs, so now we're looking, for a given feature, at all the contributors to its CLT. Then we started layering on a couple of influences we suspected, for example feature size, or things like bundle size. And what we did see in some cases: the orange is basically SIT.
You know, the time spent in SIT testing; and the yellow one is the time spent testing in the replica, or staging, environment. What you could see, in at least a few cases, was that if the feature size was big, then the testing time, SIT time and replica time put together, was a big portion of the CLT. This looked very promising; we said, okay, this is great. But we also knew feature size is not the only contributor, so we said, let's overlay all the other contributors and see if we can find some patterns, and hooray, we'll have the answer. Unfortunately, when we started doing that, things became very fuzzy. It was no longer as simple as "if the feature size is big, the testing time is more." With a single factor you can relate to the data and say, yes, this matches our mental model, but as we layered in more data and more influences, those correlations didn't hold up, and it became too complicated to drive this way. This is the point where Shriram, Rakesh and I said, well, we've outlived what we can do with Excel, and we need to turn to more statistical tools like R. I'll pass it back to Shriram to narrate the story from here. Yeah, thank you.
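The percentage view Naresh described, each phase's relative share of a feature's CLT rather than absolute days, amounts to something like this; the phase names and numbers below are illustrative, not the client's actual workflow states:

```python
# Hypothetical breakdown of one feature's CLT into phases, in days.
phases = {"dev": 14.0, "wait": 9.0, "sit": 18.0, "replica": 6.0, "deploy": 3.0}

# Convert each phase to its percentage share of the total cycle.
total = sum(phases.values())
share = {name: round(100 * days / total, 1) for name, days in phases.items()}
print(share)  # here SIT would be the largest contributor
```

The relative view makes it easier to spot which phase dominates across features of very different absolute sizes.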
So in the next iteration we started with statistical analysis. We had relatively good-quality data for three variables. We did not directly have feature size in points, but we had a proxy metric for it, and then we had bug count and bundle size. So we said CLT is a function of these three variables; let's do the analysis and see how it holds up. We split the data into three portfolios because, like I said, it's a large-scale setup with different lines of business; each portfolio corresponds to one line of business. When we split it up, in two out of three portfolios the regression could explain about 60% of the variation in CLT; in other words, the adjusted R-squared was about 0.6. That's not great, but it's not disappointing either. It just means there are still more variables influencing CLT, and so for our next iteration we want data on the actual development days and test days. For that, we're in talks with the team that manages their processes and tools, to use the capacity management module in Azure DevOps, and maybe also to do some lightweight time logging of the actual time spent on different features. I know that's not great; ideally you want to minimize manual data entry. But on the other hand, if you want sustained budget for your transformation efforts, at some point you have to demonstrate results. And if you want to demonstrate results in a somewhat rigorous manner, you have to do all this, and for that you need the data, and where will the data come from?
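For readers who want the mechanics, here is a minimal sketch of that kind of regression on synthetic data, tuned so the adjusted R-squared lands in roughly the same ballpark; the variable names, coefficients and noise level are invented for the sketch, not the client's:

```python
import numpy as np

rng = np.random.default_rng(42)
n = 120  # synthetic features in one portfolio

# Three candidate predictors of CLT: a feature-size proxy, bug count, bundle size.
size_proxy = rng.uniform(1, 13, n)
bug_count = rng.poisson(3, n).astype(float)
bundle_size = rng.integers(1, 8, n).astype(float)

# Synthetic CLT that the three predictors only partly explain.
clt = 5 + 3 * size_proxy + 2 * bug_count + 4 * bundle_size + rng.normal(0, 11, n)

# Ordinary least squares with an intercept column.
X = np.column_stack([np.ones(n), size_proxy, bug_count, bundle_size])
beta, *_ = np.linalg.lstsq(X, clt, rcond=None)

resid = clt - X @ beta
r2 = 1 - (resid @ resid) / ((clt - clt.mean()) ** 2).sum()
adj_r2 = 1 - (1 - r2) * (n - 1) / (n - X.shape[1])
print(f"adjusted R^2 = {adj_r2:.2f}")
```

Running the same fit per portfolio, as we did, is just a matter of filtering the rows before building the design matrix.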
Sometimes the data is generated simply through the actions of people; other times you have to ask for a little bit of data-entry discipline. That's what we're going to do in the next iteration. But even in what we've done so far, we've faced quite a few challenges and learned lessons in the areas of data extraction, data availability and data quality. I've been thinking and writing about this topic in other contexts as well. Jointly, all these challenges are in some ways measurement challenges, challenges with the ability to measure things, and they contribute to measurement debt, which is similar to technical debt, or tech debt, if you've heard of it. Tech debt slows things down: it reduces the rate of change, makes code less maintainable, and has all those negative effects. Similarly, measurement debt has negative effects. Measurement debt means you can't measure the result of what you're doing, and therefore you can't learn from it. And it's very common, probably even more common than tech debt in most organizations. If I want to define it a little formally: an organization takes on measurement debt when it implements initiatives, any kind of initiative, a change initiative, a new product, a new set of features for a product, which all represent investments, but does not invest in the measurement infrastructure required to validate the benefits those initiatives are supposed to deliver. If that happens, you're taking on measurement debt. In our case, what is the initiative we're talking about? I'll give you a second to think about that.
If you invest in some kind of initiative, whether it's a technology initiative or, in this context, a transformation initiative, you're still investing in it. You may be hiring coaches, you may be investing in some tooling, and so on; these represent investments. So you're investing in a transformation initiative, but if you don't have the corresponding measurement infrastructure to validate whether it's making a difference, you're basically shooting blind. And that is the state in many organizations: transformation is an article of faith. We say, oh, we're doing stand-ups, we're doing CI/CD, we're doing this, but we don't know if it's really making a difference, because we don't have rigorous measurement practices in place. So measurement debt breaks these loops. Earlier we talked about the intervene-measure-learn loop. If that loop is active, it's great: it accelerates learning, and your transformation will hopefully be more fruitful. But if you have measurement debt, you can't measure things, so it breaks the loop, and you're doing one thing after another in the hope that it makes a difference, without any means to verify it. So that was a bit of reflection on the whole process. Now I want to talk about the specific challenges. Data extraction is something Naresh and Rakesh spent a lot of time on, and without that effort none of this would have been possible. They're most closely familiar with it, so I'll again ask Naresh to talk about this. Cool. Data extraction sounds awesome, but it comes with its own set of challenges. The very first one you can imagine when you're trying to pull lots of these different kinds of data:
Unfortunately, most tools don't give you one ready-made query or API you can call to get all of this data. In our case we started at around 10,000 API calls, and by now I think we're close to 100,000 API calls, to stitch this data together. And it's not as simple as just stitching it together: in many cases we also have to perform complex data transformation steps to aggregate and reshape the data before we can present it. Once you have that, you might think you've got it, only to realize you have a very low signal-to-noise ratio: you need to discard a lot of data and pick out a few important parameters from the gigabytes you pull through those API calls. Another challenge we ran into is that, because no one was originally analyzing this data through this lens, different teams ended up doing things differently, both in their workflows and in the custom fields they used. So we had to really dig in to pull out the data and then build a layer of logic on top to interpret it per project; you can imagine a lot of configuration saying, for this team, what counts as the end of deployment to a certain stage, and so forth. One other thing: as this was happening, and as Shriram explained, you start introducing interventions that influence the data, so the data itself keeps evolving. Now your extraction has to have custom logic that's sensitive to the time period the data belongs to, and massage it appropriately so you can make sense of it.
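To give a feel for the extraction-and-stitching step, the sketch below fakes a paginated API with an in-memory stub, since the real endpoints (Azure DevOps REST calls, in our case) and their authentication are client-specific; the point is the loop-until-empty pagination and the aggregation layered on top:

```python
# `fetch_page` stands in for a real paginated API call; the records and
# workflow states are invented for illustration.
def fetch_page(page, page_size=2):
    items = [
        {"id": 1, "state": "Dev Complete", "days": 4},
        {"id": 2, "state": "SIT", "days": 9},
        {"id": 3, "state": "Go Live", "days": 1},
        {"id": 4, "state": "SIT", "days": 6},
        {"id": 5, "state": "Go Live", "days": 2},
    ]
    start = page * page_size
    return items[start:start + page_size]

def fetch_all():
    # Keep requesting pages until the API returns an empty batch.
    all_items, page = [], 0
    while True:
        batch = fetch_page(page)
        if not batch:
            break
        all_items.extend(batch)
        page += 1
    return all_items

# Aggregation layer: total days spent per workflow state across all items.
days_by_state = {}
for item in fetch_all():
    days_by_state[item["state"]] = days_by_state.get(item["state"], 0) + item["days"]
print(days_by_state)
```

In practice the per-team interpretation logic (which state means what, for which time period) sits between the fetch and the aggregation.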
There were lots of other things, but those are the few that come to mind in terms of improving our ability to extract the data and get it into a form we could analyze. Back to you, Shriram. Thanks, Naresh. Moving on to data availability: people do capacity planning, and from capacity planning you can figure out the expected number of developer days or tester days to be spent on a feature. But that might differ from the actual number of days spent, and that's where you have to come up with additional mechanisms to obtain that data. The other one is time spent in various queues; we talked about this. Most workflows don't model wait times, so if you want to start getting that data, you have to introduce waiting states into the workflows. And in teams where people multitask across features, it's hard to plan and harder to understand what actually happened. That's where a bit of time logging might help: in this table, for example, we're saying that on day six a developer spent 0.5 days on a task, or maybe two developers spent a quarter day each on that particular story. You might need data collection of that nature to gather enough data that you can start interpreting it meaningfully with statistics. There are also quality challenges. For example, we relied on state-change dates: if you're calculating CLT, it runs from dev complete to go-live, with some other states in between. Sometimes we found the state-change dates were missing. How can they be missing?
If you have a workflow, you should not be missing those state-change dates, right? Then we realized there's an anti-pattern: they weren't using dates based on state transitions. Instead, they had a whole bunch of custom date fields that people have to populate, and if people forget or omit to populate them, you get these kinds of problems. In other cases we found bugs that were not closed after fixing, or, something that initially puzzled us, bugs whose close date was earlier than their creation date. Then we realized it's because people were cloning bugs: when creating a new bug report, they clone an earlier one and just change the description. If you clone it like that, and you're using custom fields for your dates, the close date gets cloned too, so the creation date of the new bug ends up later than its close date. All these things we had to figure out slowly as we went. And this is something I already referred to: in some cases you need some data-entry discipline, even though, yes, Naresh and I have both been developers and we know developers don't like manual data entry. That's where you explain the context: in order to continue these efforts, we need budget; in order to get budget, we have to make the case that this is actually delivering a benefit. Otherwise, if you don't have the budget, the delivery pressures are still going to be there, we'll just have to soak up that pressure, and we won't be able to invest in these interventions. If you explain it like that, it becomes part of a change management process.
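As an aside, the cloned-bug anomaly described above, a close date earlier than the creation date, is easy to flag automatically once you know to look for it; the records below are invented for the example:

```python
from datetime import date

# Illustrative bug records. BUG-102 was cloned from BUG-101, so its custom
# close-date field carried over and now precedes its own creation date.
bugs = [
    {"id": "BUG-101", "created": date(2023, 3, 1), "closed": date(2023, 3, 10)},
    {"id": "BUG-102", "created": date(2023, 4, 5), "closed": date(2023, 3, 10)},
]

# Flag any bug whose close date precedes its creation date.
suspect_clones = [b["id"] for b in bugs if b["closed"] and b["closed"] < b["created"]]
print(suspect_clones)  # → ['BUG-102']
```

Checks like this became part of our extraction pipeline, so bad records could be excluded or routed back for correction.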
Framed that way, I think you can get more buy-in for the data-entry discipline. So in a way it's not necessarily a bad thing; you can see it as an additional benefit that along the way you're improving data quality and the way work gets tracked. The textbook loop just says intervene, measure, learn. In practice, what we do is: we try to measure, we find a whole set of challenges, we improve the processes or the tooling to improve the data quality, and then we're in a position to measure. We learn from that and design our next set of interventions, and that becomes the transformation loop in practice. That's pretty much what we wanted to cover. To quickly summarize our key points: measurement is necessary; without measurement, transformation efforts lose credibility. You might get investment for a year or two, and after that you might not get any more. Measurement is also essential if you want to execute data-informed transformation loops, those intervene-measure-learn loops. But even with measurements in place, it's not straightforward; there's no simple "A, therefore B" inference. It's not straightforward to demonstrate the impact of our interventions, and that is where statistical methods can help. With the right statistical methods we can answer which factors most influence the metrics that matter, and then focus on the interventions that improve those factors. And if you have sufficient data, this doesn't have to be done only at the whole-organization level: you can do it per team, or at least per portfolio or per line of business. And when you do all of this, you will usually uncover gaps in data quality and availability.
These gaps can be addressed through continuous improvement of the processes, the workflows and the tooling. That brings us to the end of what we wanted to share on this topic. We welcome your comments and questions. Cool, two minutes before time; that's pretty good. I've been addressing questions along the way, so I've already typed out responses to a bunch of them. If there's anything else, let us know. I think there was some confusion around cycle time versus lead time; I believe I clarified that. I see Pradeep saying he likes the last chart, the one about trying to measure. One of the questions was whether we'd share specific insights about the interventions we did. The objective was not to talk about this particular team's interventions, because those will differ across teams and organizations. Our objective here was more meta-level: the approach you might want to take in this context. I see one question has popped in. Tom, thanks for asking; Tom is actually the person who invented software metrics, so it's great to have him here. Tom is asking: what about other critical measures like security and usability? Shriram, do you want to take a stab at that? Well, I guess you could use the same process. What is your top-level measure of security, for example? Somebody might say it's based on the number of incidents, and you might have some sort of weighted score for your incidents.
You come up with some weighting logic and say, okay, in the last quarter our security score was this much. So you have a measure at the top, and then you figure out the contributing factors: build a contribution tree, like we did for CLT, and identify the low-level interventions that ultimately bubble up and make a difference at the top. So the same method can be applied to other measures; we just need to figure out the metric that matters, how it breaks down into a contribution tree, and get the data to do the regression analysis. Also, I don't think we're saying these things can just be bolted on at the end. Of course, the objective is to build this into the whole thought process. But given that you're starting with an organization at a certain point in time, what interventions would you introduce so that eventually these things, whether security, usability, or other critical aspects that influence the product, are baked in, weaved into what people are doing? Tom, you've talked a lot about verifying the quality of the requirement itself: is the requirement itself of good quality or not? Certainly some of that thinking can be built in. But where do you start, and where do you try to move the needle? That's how we were approaching this: identify the metric, build the contribution tree, and then gradually introduce interventions and keep measuring whether what you're doing is moving you in the right direction.
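A weighted incident score of the kind just sketched could be as simple as the following; the severity levels and weights are an illustrative choice, not a standard:

```python
# Hypothetical severity weights and incident counts for one quarter.
weights = {"critical": 10, "high": 5, "medium": 2, "low": 1}
incidents = {"critical": 1, "high": 3, "medium": 7, "low": 12}

# Weighted security score: lower is better, tracked quarter on quarter.
score = sum(weights[sev] * count for sev, count in incidents.items())
print(score)
```

The score itself matters less than trending it over quarters and regressing it against candidate contributing factors, the same way we did for CLT.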
None of these things changes overnight in any organization, and every intervention also has side effects. If you're not measuring holistically, you may be telling yourself you're going really fast while driving off a cliff, or in the wrong direction. So the point is to establish this kind of thought process, where you use the continuous learning that comes from measuring the data, and be data-informed, if not data-driven. Yes. I hope that answers your question, Tom. I know we're out of time, but we're happy to pop into the hangout area, answer more questions, and maybe even show some of the other things we've been doing, if anyone's interested.