So, I've given this talk, not this exact talk, but a version of this, in quality discussions with customers. And we talk about how things have changed in the cloud cadence, and the combined engineering, and shift left. And at the end of the presentation, they say, wow, that seems pretty drastic. I don't know if we're ready to do that. And the reality is that our testing approach at Microsoft has been evolving for quite some time, I would say for more than a decade. So before I go into what happened during the cloud cadence, I'll just give you a quick history of what happened before that. Because the customers that you might talk to may be at different phases in this transformation. And I won't spend too much time on the history, but hopefully it will give you a sense of how to guide customers through this in case they are somewhere prior to where the cloud cadence happened in VSTS. So I'm going to take you back to the 90s. For as long as Microsoft has shipped products, we have always had three distinct disciplines in a product team: PM, Dev, and Test. PMs gathered customer requirements and wrote specs; devs wrote and designed code; and test wrote tests. Roughly, we'd have a 1 : 1 : 1.5 ratio. It varied by team, but that's the general ratio we used. Now within test, we had two distinct disciplines. Many people may not know this, but this was a unique setup at Microsoft where, within test, we would have Software Design Engineers in Test, the SDETs, who developed the automation, the test infrastructure, et cetera, and then Software Test Engineers, the STEs, who ran the automation or ran manual tests. And this is a key point: the SDETs were hired with very similar qualifications as the Software Design Engineers, the developers. They went to the same colleges, and if you hired from industry, you would pretty much hire developers and then convert them into SDETs.
Now remember this point when I come back and talk about combined engineering, because this is important, particularly in the way the test discipline was set up at Microsoft. So how did it work? Well, it worked reasonably well back in the day. We achieved commercial success with big products like Windows and Office. One of the benefits of this model was that when we were ready to do a product sign-off, you would have the quality discipline, the test discipline, bring very formal sign-off criteria and formal measurements of quality. And so that gave us pretty good confidence in declaring a product ready to release. It also developed deep expertise in testing, because the test discipline was solely focused on testing. They were thinking about it day in, day out. So that was the great thing. But did it really work, though? And the answer is no, it did not work. There were problems. The problems were simply masked by the fact that, A, we had commercial success with our big products and, B, there was a long product cycle. But there were numerous problems. The developers just threw the code over the wall to the testers, the SDETs. The SDETs wrote automation and then threw that over the wall to the STEs, the Software Test Engineers. And the way teams responded was by just adding more and more STEs, particularly vendors. There was really no growth opportunity for the STEs because they didn't have any upward mobility; they couldn't go anywhere. It was very expensive to maintain this setup. And testing became a bottleneck and caused product delays. But again, we didn't feel it as much because our product cycle was long. We shipped Windows every two or three years. So this sort of worked. By around 2000, the late 90s, it became very clear in the company that this wasn't working and we had to change something. So a company-wide decision was made to get rid of the STEs.
Within the test discipline we had SDETs and STEs; now there would be no more STEs. They were gone. And we did that. It was actually very painful, because remember, the STEs didn't have the same qualifications as the SDETs. We tried to find a lot of them new roles in the company. Some of them found one, but many of them didn't. This sort of improved the model a little bit, in the sense that now you had SDETs who were responsible for not only writing automation but operating the automation. They owned the whole thing. So they were naturally incentivized to, A, write good automation and, B, write more automation; instead of just throwing a test over to another team to take care of running it, they were now responsible for it all. But the core problems still remained. The developers would throw the code over the wall, and the SDETs were constantly trying to catch up. So we got clever and said, you know what? We're going to introduce a thing called a quality milestone, or an MQ. This is a milestone we would have after a product is released and before the next product is about to start. We'd block off a certain period of time and say, whatever quality debt or test debt we accumulated in the previous release, we'll just catch up and fix it there. A clever idea, but it didn't work in practice, for a couple of reasons. One is that now people knew there was a milestone coming up called quality, so they would just defer quality work to that milestone. The other issue is that when you have a milestone dedicated to quality, people would conjure up all kinds of quality initiatives, things they thought were creative within the quality realm, and try to schedule that work, causing priority inversions. And sometimes that work didn't get done and we just accumulated more debt. So, a clever idea that didn't really work. Test was still a bottleneck, but again, we survived because we were in this waterfall world.
Then came the cloud cadence. The arrival of the cloud cadence, around the 2008 to 2010 timeframe, brought new pressure on the system. Now there was an expectation that we were running a much faster cycle. And the expectations just continued to increase: faster, faster, faster. Gone were those long stabilization phases. We didn't have the opportunity to create a beta, give it to customers, and do dogfood. Those kinds of validation phases were gone. They were crutches in the past, but now they were gone. You're living in the world of microservices. These microservices are deployed independently, so there is pretty significant complexity in getting those services right, getting the quality right, on an independent cadence. We talked about how we had to support no-downtime deployments; the services need to stay up all the time. So what did we do? Well, we had known how to ship software for the last 25 years. So we said, well, just use the same approach, just try to do it faster. We went from a two-year cycle to a six-month cycle to a three-week cycle. Just figure out how to do whatever we knew, just do it faster. So our initial approach was: same model, run faster. We pushed to make our automation more streamlined. And we got clever again. We said, oh, one of the ways we can deal with this is that we don't need to run all the tests. Guess what? We can be very smart about which tests to run. We'll pick some tests here and some tests there. And that's how we'll survive. But it was just a matter of survival. It became very clear to us that the model wasn't working, and we started seeing all kinds of issues. Testing was a major bottleneck by this time. Particularly in VSTS, I think Bill or somebody mentioned that we had sprints where we would do a three-week sprint cycle, finish that, and then go through another three weeks of stabilization. By the time a sprint got deployed, it would take another three weeks.
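That "pick some tests here and some tests there" idea, selecting tests based on what changed, can be sketched very simply. This is a hedged illustration of the general technique, not how VSTS actually implemented it; the coverage map, file names, and test names are all hypothetical:

```python
# Minimal sketch of change-based test selection: run only the tests
# whose covered files overlap the files touched by a change.
# The coverage map below is a made-up example, not real data.

def select_tests(changed_files, coverage_map):
    """Return the set of tests whose covered files overlap the change."""
    changed = set(changed_files)
    selected = set()
    for test, covered_files in coverage_map.items():
        if covered_files & changed:  # any overlap means the test is relevant
            selected.add(test)
    return selected

# Hypothetical mapping from test name to the source files it exercises.
coverage_map = {
    "test_checkout": {"cart.py", "payment.py"},
    "test_search": {"search.py", "index.py"},
    "test_login": {"auth.py"},
}

print(sorted(select_tests(["payment.py"], coverage_map)))
```

The obvious trade-off, which is exactly what bit the team here, is that any gap or staleness in the coverage map silently skips tests that should have run.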
And while that team was running around trying to stabilize the system and deploy it, in the meantime the work on the next sprint was already completed. So they were just trying to catch up, and the cycle would continue. Same issues: lack of accountability on the devs. The short version is that we recognized that this model wasn't working. And in fact, we were not the first ones to recognize it. There were services before us, like Bing, one of the major services at Microsoft, that saw this. And we also observed the practices that companies born in the cloud were following in the industry. So we knew that we needed a new model in the cloud cadence. So that's where we get to this point. The rest of my talk is about what happened in the cloud cadence. I just wanted to give you a flavor of what happened before that, because you might run into customers who are probably still in the world where you have STEs running manual tests and there is not a lot of emphasis on automation. You have to bring them along on the journey before you talk about some of the other stuff. I'll walk you through it. So what happened in the cloud cadence? This pretty much sums up the three big things that we changed. We changed the quality ownership; we fixed the quality accountability. That's number one. The second thing is that we understood that in order to ship frequently out of a release branch, you need to have a master branch that is also in pretty good shape, always in a shippable state. You saw Bill talk about how you work in master and then release from the release branches. Quality is not just about getting the release branch right. Quality actually starts in the master branch and keeping it in a shippable state. Now, that is a statement about a lot of things, the code flow and the branch mechanics that Bill talked about. But from a testing perspective, we focused on two things.
One is this concept of shift left, shift-left testing, and I'll talk about that in a second. The second thing was getting rid of all the test flakiness in the system. The other thing that we understood is that there is no place like production. I would call this the shift-right part of the strategy. So shift left is: run tests close to the code, run more unit tests. To me, shift right is: run tests close to production, because there is no place like production. It's a set of practices about both safeguarding production and ensuring quality in production. So in a sense, we got rid of the testing that was happening in the middle, the integration-style testing, the functional testing that used to happen in the lab. That was the big departure here. All right, so I'm going to walk through each of these concepts in a little more detail. Quality ownership. So we did combined engineering. You've heard this term before; we've talked about this. Combined engineering, in a nutshell, is taking those two disciplines, those two roles, dev and test, merging them, and putting them into a single discipline, a single role, called an engineer. So we got rid of the two SDE and SDET roles; there is just one role, engineer. The key thing is that when we did this, that individual had a combined responsibility for both dev and test. So it's not just an organizational change where you bring dev and test together. It's an actual discipline merge. If you think about the set of qualifications and requirements for SDEs and the set of qualifications and requirements for SDETs, you merge them into a single set: that's what this was. And so everyone had to learn new skills. A lot of times when I talk about this, the first question I get is: so what happened to those SDETs? Did they learn how to write code? Well, the reality is that, remember the qualifications I mentioned earlier, they knew how to write code.
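On the flakiness point, one common policy, sketched here with hypothetical names and not necessarily what the VSTS team implemented, is to rerun a failing test and report a pass-on-retry as "flaky" rather than green, so unreliable tests get quarantined and fixed instead of eroding trust in the runs:

```python
# Hedged sketch of a rerun-based flakiness policy. run_test is a
# stand-in for whatever actually executes a single test case and
# returns True on pass, False on failure.

def classify(test_name, run_test, retries=2):
    """Return 'pass', 'fail', or 'flaky' for a single test."""
    if run_test(test_name):
        return "pass"
    for _ in range(retries):
        if run_test(test_name):
            # Passed on retry: the result is unreliable. Flag it for
            # quarantine rather than counting it as green.
            return "flaky"
    return "fail"

# Simulate a test that fails once, then passes on the first retry.
results = iter([False, True])
print(classify("upload_test", lambda name: next(results)))
```

The design point is that "flaky" is surfaced as its own outcome: treating a retry-pass as a plain pass hides exactly the instability that makes a master branch untrustworthy.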
They had gotten a little bit rusty in terms of their design skills. But this was also a learning for the developers, because developers now had to learn how to write tests, write automation, run tests, do manual testing, do exploratory testing, things like that. So this required learning on both sides. And I think that's a key point: a lot of times when people talk about combined engineering, they say, oh, OK, that means I need to train my testers to be more like devs. No, no, no, it actually goes both ways. The other key concept here was that the idea behind this is that you want to reduce handoffs. In a short cadence, you don't have the opportunity to start somewhere, write your code, give it to another team to test, then give it to maybe another team to do performance testing, and give it to another team to do deployments. The basic idea was that we wanted to reduce handoffs in the team and give end-to-end accountability to a feature team, to an engineer inside a feature team. So this was a big cultural shift across the company. This change happened in one team, but then over a few years, every team across Microsoft changed. Now, different divisions took slightly different approaches to doing this. In some cases, like us in VSTS, when we did combined engineering, it was pure: we just merged the two disciplines. There is no other team that is responsible for quality. Every feature team owns its own feature area's quality. In some other orgs, they still kept another small team to look after things like live-site telemetry or live-site instrumentation. But ultimately, if you fast forward to now, just about all teams at Microsoft follow this model. How did we make this transition? I think this is an important thing to talk about, because like I said, it's a pretty drastic change. Our first transition, the one I talked about where we got rid of the STE role, was very painful. And we learned a lot from that.
So when we rolled out this change, particularly in VSTS, fortunately for us, there were a couple of other teams that had already done this at Microsoft. So we went and talked to them; we learned from them. We had a lot of discussions in the org, getting the team ready to do this. One of the things we were very concerned about was that there are all these things that the test team does. Some of it is what I described at the time as dark matter: nobody understands what they do, but they do it, and somehow magic happens and the right quality comes out at the end. So we meticulously went and inventoried everything that the test team did. It was a giant spreadsheet. I forget how many rows, but it wasn't just "we run automation" or "we write automation"; it was all the little things that the test team did to keep track of quality in the org. And we made sure that all those responsibilities were reassigned to somebody in the org, basically to these new roles. That was very key. The second thing we were very clear about is that this is not just changing roles and responsibilities. We were going to have to change the way we test, period. If we continued to test the way we were testing before, it was not going to work in this new world. So this is where the shift-left testing and testing in production come in. Those concepts were not only internalized but practiced in the org. And we gave ourselves about 12 months to go through this transition. Now remember, when we did this, six months later we were shipping TFS 2015. So the litmus test was getting the quality right for TFS 2015. So we said we'd give ourselves about 12 months, which meant that during that time, the SDEs and SDETs would start off basically in their old roles but slowly evolve into the combined responsibility. So you may have a feature team.
Within that team, the people who were former SDETs continued to do more of the SDET work, and the former SDEs continued to do more of the SDE work. But sprint by sprint the ratio kept changing, and eventually, after six months, you could not tell a dev from a tester in the org. So that's how we managed it. Now, at the end of the transition, there were some people who didn't quite make the transition, and that was the sad reality. But we supported the transition through training, through the development of the new skills, letting people practice, sprint after sprint after sprint. Giving yourself a practical time frame to do this is key. By the way, feel free to ask me questions, otherwise this will, yeah, go ahead. So in this new model where everyone on the team is an engineer, how do you take the responsibilities that were previously spread across the quality organization and delineate responsibilities on a team where theoretically everyone has the same skill set? I'm wondering how the division of labor actually occurs. Because right now on the team I'm on, for instance, our test automation is done by different engineers; they're automating the tests and writing code, but they're not writing features. And it's on the same team, so theoretically we're doing this too, but it seems like it's different from what you're describing. Yes, it is different, and I think it's an important thing to clarify. In the beginning it looked like what you just said. So let's take a particular feature team. In the old setup we had five developers and maybe five testers, and that constituted a feature team. We bring them together under a single engineering manager. So now that engineering manager has 10 engineers working for them, responsible for the same area.
Sprint one after combined engineering happened, it probably looked very similar to what you just described: the tester was still spending most of the time developing tests, and the developers were spending most of the time developing code and design. But the North Star was clear. The North Star was that an engineer who owns a feature owns it end to end. They can take a lot of help. They can get a lot of peer reviews of their test plans, of their design, of their telemetry. In fact, they were encouraged to get a lot of help in terms of peer reviews, but the expectation was that the next sprint, guess what, they would be the one writing test automation for the feature they own. Maybe they start with a small feature where they do that. And the same was true for the testers. The testers started picking up small features off the backlog and said, we'll own these features end to end, all the way from the design phase to deploying to production and then monitoring in production. So it started off like that, and over time we expected devs to pick up more and more of the test responsibility, and vice versa. We would flip roles at times also. And that's what I mean by allowing the team that 12-month duration to transition into this new world. Could you speak a little bit about what happened to your team's velocity, in particular the development velocity, producing business value? Did that suffer during this transition period, and do you feel like you're back to where it was? Yeah, so on velocity, I don't know if Aaron showed you a chart of our feature velocity. One of the ways we measure our velocity is simply the number of features delivered on average per sprint each year. And if you look at that chart, it's been constantly going up since 2012, I believe, when we started tracking, through now, 2017. So the short answer to your question is no, the feature velocity did not drop.
Because remember, you still have the same number of engineers in the feature team. You just took the two separate teams and put them together. Yes, you're spending a little bit more time on learning and development, training the new skills. But there was also an efficiency gain through this process. And the key efficiency gain is that you're not handing things off to another team. When you hand things off to another team, guess what happens? There's a context switch. It's like one thread waiting for another thread to complete, and then it has to pick up the context and run again. That constant back and forth that used to happen between dev and test, that's gone. And so you gain quite a lot. So in the long run, you absolutely gain velocity. You absolutely gain more capacity. In the short run, you could argue that, hey, there is some period of training and learning. But it's a good investment in building out, rounding out, the skills. And as I'll talk about in a second, the change was pretty profound across the org. It's not just my feature team. In fact, I can talk about that now: we got rid of this notion of specialized hand-off roles. There are no handoffs. You don't take a feature, write it, design it, give it to another person to test it, then give it to another person to deploy it. Maybe there's another person, like Bill talked about, a branch mechanic whose job is to push code around. Maybe there's another person whose job is to make sure the product is ready and has the right performance metrics. All these different people: another person testing the deployment, doing configuration testing, things like that. We took the core principle that there are no central teams, no specialized teams that do certain tasks. That was the core principle. But at the same time, we understood the importance of specialization. Specialization is important.
Creating a central team that you hand things off to, in a fast cadence, is a problem. So we didn't want to lose specialization. So we did form a bunch of V teams. And I deliberately call them V teams because these are not dedicated teams. Right now in Brian's org, there is only one team you could call a dedicated team: the EPS team, the team that runs our central engineering system. It's a small feature team. But again, they do core engineering work; they're contributing to the engineering system. Nobody's handing things off to that team. Of the V teams we formed, one was the test architecture V team. Now, this was a new thing. We didn't have this before. We had an architecture V team that looked after the product architecture; we didn't have anybody looking after the test architecture. And remember, we knew that we had to change the way we test, which meant we had to rebuild our test infrastructure, rebuild the way we authored tests, from the ground up. So we picked our senior-most engineer. In fact, Bill is a partner-level IC in the org, the senior-most IC. We said, you're going to lead this team. And he had a set of other engineers from across the org as part of the V team. And this team's job was, like I said, not only to build the new architecture for test, but to champion the set of practices that we were talking about. Yes, question. Can you please explain the concept of a V team? What is it, what does it mean? V team means virtual team. These are members from different parts of the organization. It's not a dedicated team reporting to a single manager. That's what I mean. We also had the Tenet Champs V team. So what are tenet champs?
So these are people who are subject matter experts, looking after some specialized activity that we do in test, whether it's making sure the product is accessible, making sure the product has good performance and reliability, that it's globally ready, things like that. This, again, used to be largely done by the test team in the past. In the new world, we refactored these responsibilities. Again, every feature team is responsible for making sure that their feature is accessible, is performant, is globally ready. But we would have a V team of experts from throughout the organization whose job is to build deep expertise in this type of activity, this type of work. So subject matter expertise is still valued in the org. Specialization is still valued in the org. The main difference is that it's not consolidated into a dedicated team. I mentioned the performance V team. This is important, because when you're looking at service performance, product performance, oftentimes you find bottlenecks elsewhere. Let's say you own one of the top-level features and you're doing performance testing for work item tracking, and you find bottlenecks somewhere deeper in the system, owned by somebody else. So we formed a performance V team whose job was to identify common bottlenecks across the entire product, come up with the right design solutions, and drive that. This kind of work you cannot just farm out to individual feature teams, because performance is an end-to-end problem. It's not isolated to a particular layer of the product. A couple of other V teams: V-Mod. Don't even ask me what V-Mod stands for, because right at the moment I may not be able to figure that out. But V-Mods are the people who look after our daily build health and CI health. These folks are constantly watching the builds and the runs, and if there are any failures in them, they do a quick triage and assign them to the appropriate owners.
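That triage-and-assign loop the V-Mods ran can be illustrated with a toy routing helper. Everything here, the area prefixes, team names, and test name, is made up for the example; it just shows the shape of the activity, mapping a failure to the feature team that owns that area:

```python
# Toy sketch of routing a CI failure to an owning team based on the
# failing test's area prefix. The ownership table is hypothetical.

OWNERS = {
    "workitem": "work-item-tracking-team",
    "build": "build-team",
    "git": "version-control-team",
}

def triage(failing_test, owners=OWNERS, default="vmod-triage-queue"):
    """Pick an owner from the test's area prefix, else a default queue."""
    area = failing_test.split(".", 1)[0]  # e.g. "git.push_test" -> "git"
    return owners.get(area, default)

print(triage("git.push_large_repo_test"))  # a hypothetical test name
```

The point of having a default queue is the same as having V-Mods at all: a failure with no obvious owner still gets a human looking at it quickly instead of sitting unassigned.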
So we formed a V-Mod team, and you'll see that over time, the size of the V-Mod team shrank, and what we expected V-Mods to do also shrank, as the engineering system got better. Finally, we retained a small vendor V team that owned some really hard-to-automate types of tests, like config tests: TFS being deployed on-prem in different configuration environments. But again, over the last three years, this team has constantly shrunk, because every year we ask the question: why do we need so many vendors doing this manual testing? Let's go automate that, or let's figure out a different way of running those tests. So that's the end of what happened in terms of changing the quality ownership and accountability. A quick question: what does Tenet Champs V team mean? I don't understand the word tenet. Yeah, so performance is a tenet, accessibility is a tenet. Like an aspect? Yes, an aspect of the product, there you go. An attribute of the product, a quality attribute. The other question was: I heard from Scott Guthrie in a presentation some time ago about the move to everything being done through the command line. Did that help in automating some of those aspects that the vendor team was having to do manually? No. Today our vendor team is doing things like TFS on-prem, which can be deployed in so many different configurations that, in theory, we could automate it, but it requires a significant amount of investment. And so for things like that, the cost of automating is significantly more than the cost of just running it manually. Thanks, Dennis, for the question. So it's a matter of trade-off, cost versus effectiveness? That's right.
Initially it was a matter of survival, because remember, we came from a world where you have a team that is basically running tests and writing tests, to a world where suddenly that responsibility is distributed out to the org. And so initially, just as we inventoried the whole list of things the test team was doing, we knew that a good chunk of the test team was doing this manual testing. And even then we had the vendor team; even the test team used to retain a vendor team that would run this kind of hard-to-automate test. We didn't want that to drop on the floor. The key criterion going through this was: we are shipping TFS 2015 in six months. It needs to be as good in quality, if not better, than it was in the previous model. Not only that, we are shipping, and will continue to ship, to the cloud every three weeks. So in going through this, the key criterion was that nothing should fall on the floor. Nothing should slip through the cracks. So even if we were doing something that was not optimally designed or efficient, we just continued to run it in the new world until we figured out a way to do it better. Does that make sense? The initial thing was: take whatever we have, just refactor it, give it to a different set of people, meaning give it to the feature teams. But don't drop it on the floor. Even if it looks questionable, like, why are we running this test? It's not adding any value. Just keep running it for now, until you figure out that there is a different way to do it. Yeah, question? So if I understand well, these are virtual teams. Does this mean that these people have other assignments, and if so, what's the ratio between their capacity in these assignments and the other things that they do? Yeah, so these are the same people that we have in the feature teams. So let's take a specific example: accessibility. For accessibility we have subject matter experts by location.
So in Redmond we have two people who are accessibility experts. And we have another couple of people in the North Carolina location and maybe another couple of people in India. They are deep experts in accessibility techniques, philosophy, et cetera. But they are engineers inside a particular feature team; they just happen to have this secondary responsibility. Accessibility happens to be one of those tenets that places quite a bit of responsibility on that subject matter expert. So the person who's the accessibility champ probably spends half their time doing accessibility and half their time doing feature work. But the idea is that that responsibility also rotates over time. It's not the same person every single sprint. We may stick to the same person for a given release: for TFS 2018 there's one person; maybe for TFS 2019 we'll try to give that responsibility to somebody else. So nobody is doing this for life, if you will. And the number of experts varies by the tenet we are talking about. The performance V team, I think, has about a dozen people. Yeah, a question. So in this model, I see that an individual is probably taking care of many things. The person who did testing had to take care of so many tenets and things, and now, as a developer, they have other responsibilities as well. So I imagine you manage that by maybe making smaller features and things like that. Did you have any challenges with end-to-end coverage? Because now my responsibility as a developer is even smaller. How did you manage that end to end? Were there any gaps that were unveiled by these things? So if I understand your question correctly: yes, on the surface it looks like your engineer is now doing twice the amount of work in a feature team, because previously, if I'm a dev, I have a tester who's paired up with me, and he's doing half of the feature work, the testing part.
Now I am responsible for the whole feature. But remember, if I'm the manager of the feature team, I now have twice as many engineers as I had before. So the net capacity hasn't changed; net capacity is still the same. Or I give you more time, one of the two, right? Yeah.