So, this is on VSTS architecture. I'm Buck Hodges, Director of Engineering for Visual Studio Team Services, and we're going to talk about our journey from taking what was TFS and turning it into a service. One of the questions I always get from customers is: hey, should I refactor my software, whatever it is, into a cloud service before I move to the cloud, or should I go ahead and move? We did the latter. We went ahead and moved, and I'm going to talk about what that looked like. One of the things we learned along the way is that if we had tried to, quote, turn it into a cloud service without actually going and doing it and living it, we would have made a bunch of decisions that in the end wouldn't have been the right ones. We made different and better decisions, I think, because we took the approach that we did.

I'm also going to talk about some of the things we did as part of becoming a cloud service: feature flags and being able to control exposure; resiliency, where I'll touch on circuit breakers and resource utilization, or throttling, and why that's important; and finally containers, which is where we're headed. So let's dive in.

In the beginning: it was 2010 when we started this journey. I forget exactly when we decided to go do it, but our target was the PDC in October 2010, when Brian got up on stage to demo what was then a prototype of TFS running as a service in Azure. To set some context, 2010 is when we had just started to adopt Scrum. You all saw Aaron Bjork's presentation; this goes all the way back to August 2010, so we were at the very beginning of the Agile transformation.

We had TFS, an on-prem product we'd ship every two or three years. We had some good stuff there. The 2010 release, in part, set us up for being able to move to the cloud: it was the first release where you could have load-balanced application tiers, and it was the first release that had the notion of a collection. That became important because a collection becomes an account in VSTS. Another piece, which I'll come back to, is that we had what was essentially a server framework. It was all TFS, and part of what you're going to hear is that we ripped a lot of things out of TFS to reuse them. So we had some good parts to start with, but it was still an on-prem product.

We had no experience with Azure whatsoever. It was brand new to us, and of course Azure itself wasn't very old in 2010. All of our data was in SQL. All of you, of course, know TFS on-prem shoves everything into SQL: file content, metadata, SQL is the answer for all of our storage. It can't be that way in the cloud, but that's where we started. We only had the notion of Active Directory identity: you use NTLM or Kerberos to log in. There was no notion of a Microsoft account or AAD or any of the rest of this stuff. And the notion of an account? Not even close. There was nothing like that in the product.

So this is where we started, and with that starting point, we moved into Azure. Let's take a look at what that meant. When we started, we had a single-tenant system. What does that mean? It means that every time you signed up for an account, that collection, as I mentioned before, became an account in the cloud. In the beginning, every collection was its own database. So every time somebody signed up in VSTS, we'd go create a new database.
At one point, we had 11,000 databases in Azure North Central. We had a lot of DBs, and we were going to go broke if we kept it that way, but that's where we started.

Upgrades were offline. Some of you who've used the service forever may remember this, because it took us a while to fix: we'd have to take the service down. So we'd schedule maintenance windows, which of course is completely awesome, because everybody loves maintenance windows.

We had no telemetry. I mentioned we were pushing for the proof of concept, which we did in October 2010 at PDC; our next big milestone was the BUILD conference in September 2011. Compared to what we have now, we had no telemetry. We had kind of started. We thought we knew, hey, we should trace this, trace that. Of course, we were super naive. This is actually why I signed up for Twitter. Some of you follow me on Twitter; the whole reason I signed up was so I could keep track of what was going on with the product, because I figured if it broke, somebody would complain on Twitter. That was telemetry.

We had no deployment cadence. Our deployments were months apart. Ed Glass is going to talk later today about how we do deployments now, but at this point they were months apart and really random, meaning there was no real thought into, hey, over the next year, when are we going to deploy? How does this work? No clue.

We had no live-site culture. This is a team that was built and recruited around shipping a box product: I give you the product, you run it, not me. So this was a big shift in how to think about engineering. You'll hear more from Tom Moore on live site and everything that's involved in it, but it's also a cultural thing, and we didn't have it. And of course, nobody who joined our team ever thought, oh, I could be on call 24-7. That's a big, big change.

So let's flip now to architecture and talk a bit about how this thing is put together. You heard about the starting point: in the beginning, there was only TFS, and there was only one instance, and it was only in Azure North Central, which is Chicago. That was it. It was a single global instance. We had everything there: version control, work item tracking, build, test, identity, account. Of course, things like release management and code search didn't even exist back then. But everything we had was in one place.

The application tiers that you're familiar with from TFS, we put those in virtual machines, using what is now known as PaaS, Platform as a Service, v1.0: web roles for the application tiers, worker roles for the job agents. So when we build the product, we literally put together the packages that you need to deploy for PaaS, and that's how we do deployments. As I mentioned earlier, we're going to move to containers; that's still a work in progress. But this is where we started, and honestly where we still are in a lot of ways.
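To make the web role / worker role split a little more concrete, here's roughly the shell of a classic Cloud Services worker role, the kind of host a job agent runs in. This is a minimal sketch against the old Microsoft.WindowsAzure.ServiceRuntime API, not our actual job agent code; the poll loop is a placeholder.

```csharp
using System;
using System.Threading;
using Microsoft.WindowsAzure.ServiceRuntime; // classic Cloud Services SDK

// Illustrative shell of a PaaS v1 worker role; not real job agent code.
public class JobAgentWorkerRole : RoleEntryPoint
{
    private volatile bool _running = true;

    public override bool OnStart()
    {
        // Wire up diagnostics, connection limits, etc. before Run() is called.
        return base.OnStart();
    }

    public override void Run()
    {
        // A real job agent pulls queued background jobs (processing commits,
        // cleanup, ...) and executes them; here we just idle.
        while (_running)
        {
            Thread.Sleep(TimeSpan.FromSeconds(5));
        }
    }

    public override void OnStop()
    {
        _running = false;
        base.OnStop();
    }
}
```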
The content is really interesting. On-prem, we shoved everything into SQL. In the cloud, first of all, that wasn't possible, because the maximum size limit for SQL databases was much smaller than it is now. I want to say it was something like 100 gigabytes at the time. I might be wrong about that, but it was certainly not the 500 gigabytes to a terabyte you can get now. But more than that, it was too expensive: per byte, SQL Azure is way more expensive than Azure Blob Storage. So we had to farm out the content: file attachments, the files you have in work item tracking, be it Git repos, be it TFVC, whatever it is. We had to split all that out into Blob Storage.

And you remember earlier I mentioned that with TFS we had the notion of a framework; I'm going to talk about that in a minute. That framework allowed us to start to change how things worked on-prem versus in the cloud and still stay sane with the same code base. So we put all the metadata in SQL and all the file content in Blob Storage.

But it was still a single-tenant system, as I mentioned: every account, a new database. We hadn't gone through that yet. We hadn't pulled out the notion of identity and account. It was literally a true monolith, in one instance, totally global. Every deployment was instantly live for everybody, and if anything went wrong, it went wrong for everybody. Not an ideal place to be, to say the least. But we got it started, and that allowed us to start learning.

So I'll talk a bit about multi-tenancy. This was one of the first things we had to go solve, or we wouldn't be here; we'd have been too expensive. How did we do it? We needed to have many accounts per database. So in early 2012, we introduced literally a partition ID column in every table that has customer data, and that partition ID says which customer owns that data. This is important because today we pack something on the order of 40,000 accounts into a given database, versus the beginning, when it was single tenant.

But there's a gotcha here. The worst thing is, if I take your data and give it to him, I've failed. And if I write his data into your account, I've failed. So how do we protect against that? We came up with an interesting way to do that testing, and this is really cool if you're trying to build a multi-tenant system in SQL. It's one of the few cases where we can take advantage of the fact that we have both on-prem TFS and VSTS in the cloud, since the same code runs on-prem. SQL Enterprise has this notion of file groups, and we can assign file groups to the data. So in testing, we assign partition ID 0 to a file group that's always offline. That means if you run a query that doesn't specify the partition ID in its WHERE clause, it's going to fail, because the file group is offline. So we had a great system for finding problems; a rough sketch of the idea follows below.

Because, you probably remember, everything was in SQL. All of our logic was in SQL. When we began TFS, SQL is where the data was, so we put a ton of logic into sprocs, stored procedures. As a result, to really test the product, you've got to test it in SQL. And since we were doing so much logic there, including now the multi-tenancy, you can imagine how many lines of SQL we had to inspect to find all the places we needed to add the partition ID clause. Having this approach to test it was incredibly valuable, and it kept us from some really, really bad experiences trying to untangle customer data.

All right, so we fixed the multi-tenancy problem. But we still had only one single global instance. That's bad for a whole bunch of reasons. Two in particular: one, every deployment is global. Two, I have no way to build more services without adding to the monstrosity that is TFS. I don't want to grow that monolith; I want to be able to build services outside of it.
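Here's that sketch of the file group trick, with hypothetical table, column, and file names; the real schema is much bigger. The setup, done once in T-SQL for the test database, partitions the table on the partition ID column, maps partition 0 to its own file group, and takes that file group's file offline.

```csharp
using System;
using System.Data.SqlClient;

// Sketch only: assumes a test database where tbl_WorkItem is partitioned on
// PartitionId and partition 0 lives on a file group whose file was taken
// offline, e.g.:
//   ALTER DATABASE TestDb MODIFY FILE (NAME = 'Partition0File', OFFLINE);
class PartitionPredicateCheck
{
    static void Main()
    {
        using (var conn = new SqlConnection("Server=...;Database=TestDb;Integrated Security=true"))
        {
            conn.Open();

            // Correct query: the PartitionId predicate lets SQL Server
            // eliminate partition 0, so the offline file group is never touched.
            using (var good = new SqlCommand(
                "SELECT COUNT(*) FROM tbl_WorkItem WHERE PartitionId = @p", conn))
            {
                good.Parameters.AddWithValue("@p", 42);
                Console.WriteLine("With predicate: " + good.ExecuteScalar());
            }

            // Buggy query: no PartitionId predicate, so the scan hits the
            // offline partition and fails -- which is exactly how this setup
            // flushes out queries that could cross tenant boundaries.
            using (var bad = new SqlCommand("SELECT COUNT(*) FROM tbl_WorkItem", conn))
            {
                try { bad.ExecuteScalar(); }
                catch (SqlException e) { Console.WriteLine("Caught: " + e.Message); }
            }
        }
    }
}
```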
So we had to go split out what we call Shared Platform Services, SPS. That's account, identity, licensing, profile: a lot of common services that every service in VSTS needs access to. And we did this on the fly, in production. We split it out in the spring of 2013, while the system was live, literally ripping the heart out of TFS and pulling it into a separate service. We did it on a Saturday; if my memory's correct, it was March 23rd, 2013. We all got together that Saturday morning, and of course I wasn't writing the code, I had a bunch of great engineers who pulled this off. We took the system down, spun up SPS for real, turned it all back on, and then fixed the issues that came out as a result. But again, we started with TFS, and we're evolving our architecture in the cloud on the fly. We're kind of doing heart surgery in the middle of a marathon.

The next thing we needed was another instance, so that not every deployment is global. This is when we created TFS SU0, our dogfood scale unit. Those of you who've heard of the mseng account, that's where it lives. Our whole team lives in SU0. Ed will tell you more about how we do deployments through rings, but every deployment we do starts in SU0. We get every change that goes out, with the goal, of course, being that if we screwed something up, we want to feel that pain and fix it before it gets to you. Now, I'd love to say we catch every bug that way; clearly, you know that we don't. But it's so valuable, it's saved us many, many times, and it's a fantastic practice. We also have a luxury: we develop a product for engineers, for developers, and we are developers, so the fact that we can use our own product is really nice.

Every deployment goes out through SU0 first, through mseng. I forget how many people are using it, but one stat might give you some sense of scale: in our repo, which is VSTS, 430 people checked in to VSO over the last 30 days. Those are engineers pushing code, and of course we've got program managers and so forth. And that's just one repo; we've got a number of other internal accounts in SU0. But our whole principle here is that it's only internal accounts. If we do something bad, and sometimes bad things happen, it only affects us. It doesn't affect our customers.

From there, we started adding more. I put SU7 up here, which is Australia, but we now have 15 instances of TFS. The latest one, great timing: yesterday Brian wrote a blog post, we've got one in Canada now. Yeah. Good reaction there for me, too. We're going to continue to add more, but right now it's the service with the most instances and the most locations around the world.

That allows us to do deployment in rings. It's no longer one shot where everybody gets it. We deploy to SU0, and that's ring zero. Then we deploy to ring one, which currently is Brazil; ring two, which is SU3, down in South Central; et cetera. Ed will really walk you through how that process works, but it's been critical to changing how we roll out changes. And by the way, that includes not only code changes but also config changes, because we all know a bad config change can take you down, too.

Most recently, we added a second instance of SPS. Until February of this year, SPS was purely global, and we finally added a second instance.
We call it SPS SU0, because it's paired up with TFS SU0 and the SU0s for the other services. That allows us, for the first time, to start deploying changes for SPS non-globally. We have a whole effort underway to partition SPS itself into multiple instances around the world. That work's ongoing; if I were talking to you a year from now, hopefully I'd be telling you how we did it and how it turned out. But we're in the midst of it.

So, talking through that split: we got SPS pulled out. That then enabled us, and I'm going to fast-forward in time quite a bit here, to get to where we are now: 31 services. They all mostly revolve around SPS. Some, of course, have dependencies on TFS; code search doesn't have anything to index if you don't have code, and so forth. But we have all these services that can build on the foundation SPS created and therefore not contribute to building a giant monolith. That was an important thing for us.

Taking a small deep dive into what it looks like to have an instance of any service, be it TFS, SPS, code search, RM, any of them: they all basically look like this. The only question is how scaled out they are. A typical scale unit might be four Dv2-series instances of, again, a PaaS web role, and, say, three Dv2 VM instances for the job agents.

By the way, I don't think I explained exactly what the ATs and JAs do earlier. An application tier is responsible for all the communication with clients. So when you fire up Visual Studio, or you fire up a browser, and you connect to VSTS, you're connecting to an application tier. It's got the REST endpoints and the SOAP endpoints; obviously we're not writing new SOAP APIs, but there are still quite a few that exist as we go through the transition to REST. That's the public face, if you will, of the service. The job agents run background jobs. That functionality exists in TFS too; they're just not deployed as separate machines on-prem. A background job could be processing commits, doing cleanup, any number of things.

Our largest scale unit in TFS has about 11 D4 ATs and 32 partition DBs, with about 40,000 accounts per DB. And so, yes, that's about 1.3 million accounts in SU1. These days, we don't let the scale units get quite so big; we tend to cut them off at 700,000 to 800,000 accounts instead. And we've got about 120 terabytes in blob storage in just that one scale unit. So there's a lot of data there, but that gives you a sense of scale for at least the largest one.

The next piece: some stats to give you a sense of the service overall. As I mentioned before, we have 31 services in VSTS today. We're in 15 regions total, and like I said, TFS is the one that's in the most regions; not all the rest are. We have 192 scale units total if you add up all 31 services and all the scale units associated with them. And I guess I was wrong a minute ago: I said 15 instances for TFS, it's actually 17. My bad. In terms of cores, we have about 7,000 CPU cores used in VSTS, if you look at production all up; TFS itself adds up to around 1,110. We have 659 SQL Azure databases; of those, 426 are Standard and 233 are Premium, to give you, again, a sense of how much we use SQL. We use it a lot. And that adds up to about 63 terabytes of databases, just metadata. That's not file content. There's a nice bill there. Yeah, it's not cheap.
This is not inexpensive, which is why, and you'll hear about this too, I think somebody's going to talk about it, we focus on the cost of running the service. Every account gets five free users. We run a service that's mostly free; therefore, cost matters a lot.

Question? So, are the PaaS databases scaled up automatically, or is this a manual process? So today it is a manual process. We literally run a report that looks at the load on the databases. We have metrics to ensure they're healthy; if they trend above the line, we go and scale them up. If for some reason they come down and need to be smaller, we'll bring them down. Usually the ones that come down are, let's say, from a data import or something like that; typically, once a database's load grows, it doesn't suddenly shrink. It would only shrink if we moved an account out of that database or something like that. But today, it's still manual. It's actually something that will hopefully start to get automated over the next several sprints. And of course, you have to be a little careful: going up is reasonably safe, going down can be dangerous, so you need to make sure you've got good safeguards in place. Well, thank you. You're welcome.
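For flavor, here's a hypothetical sketch of the kind of decision that report drives. The database names, metrics, and thresholds are all invented for illustration; the real report and tooling are internal.

```csharp
using System;

// Invented thresholds for illustration: scale up eagerly, scale down warily.
class DatabaseScaleReport
{
    static string Recommend(string db, string tier, double avgLoad, double peakLoad)
    {
        if (avgLoad > 0.70)
            return db + ": trending above the line, scale up from " + tier;
        if (peakLoad < 0.20)
            return db + ": scale-down candidate (e.g. after a data import); verify first";
        return db + ": healthy, leave at " + tier;
    }

    static void Main()
    {
        Console.WriteLine(Recommend("partition-db-17", "S3", 0.83, 0.97));
        Console.WriteLine(Recommend("import-db-02", "P2", 0.08, 0.15));
    }
}
```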
Question? Is there a dependency or a cost factor that keeps you on virtual machines rather than choosing App Services for those backend jobs? So for the backend jobs, you're talking about the job agents? Right, why do you have to be on VMs; can you get away from VMs? Good question. So today they are virtual machines, and we're going to move to containers. But I don't think that really addresses what you're getting at, which is: could we use a different Azure technology for it? We could. One of the interesting challenges we have is that we ship TFS and VSTS: an on-prem product and a service in the cloud. We try to keep the code base the same, and if we run it very differently in the cloud, it makes it harder for us to test the on-prem product and know that we have the right quality when we ship it. Which, by the way, is becoming a bigger and bigger challenge, because as you saw in Aaron's presentation, most of Microsoft is on VSTS, and it's only getting more so. In another year, 18 months, whatever, the instances of TFS on-prem internal to Microsoft, which at one point numbered 71, 76, something like that, and had heavy use, won't have so much use anymore. Some of them, like DevDiv's, are still there, but people only use them for patching the legacy products. So part of it is the constraint of shipping an on-prem product and wanting to keep the code the same. In an ideal world where we were simply cloud-only, we would take advantage of Azure Functions or whatever. I'd love to take more advantage of that, but right now I've got to keep a good balance between my ability to use the service to test the code and being able to take advantage of cloud-native technologies.

Yes? So, how should we understand the relationship between the rings and the instances? Okay, Ed Glass is going to go through that in detail, but basically what we do is take the instances and map them to rings. Ring zero has one instance, ring one has one instance, ring two has two instances, ring three has four, I think, and ring four has about five or six. That probably doesn't add up to quite the right number, but roughly speaking, that's the way it works, so that every time we go to a new ring, we're getting a much bigger audience. If there's a problem, we want to find it, of course, in an earlier ring, with as few people affected as we can, to limit the so-called blast radius of the problem. That's the whole goal of the rings, and he'll go through it in a lot of detail. And I already mentioned we've got 1.3 million accounts in scale unit one.

So, I touched on this in answering the question earlier, but I call it 90% the same code base. It might be more than that, 95 or 97%, but it's basically the same code. I mentioned the server framework earlier, and I'm going to talk about it in more detail in a moment, but the server framework, like so many other things, was originally part of TFS. It's what allows us to write a bunch of different services and have them all work the same way, and we'll talk about what that means. It enables a key piece here, which is that we can write plugins that allow us to abstract away some of the differences. For example, authentication and identity are completely different in the cloud: Microsoft accounts and AAD with org IDs don't exist on-prem. So we've got plugins for the identity system: if it's running in TFS, it uses one plugin, and if it's running in the cloud, it uses a different one. When it comes to shipping TFS, it's those differences where we have to focus most of our testing, to make sure that, hey, we're not getting mileage on this in the cloud, do we have it right in TFS?

Storage is the same: I mentioned before that we split storage between SQL and blob storage in the cloud, while it all goes into SQL on-prem. By having a common server framework with plugins, you can go write a service, like search, for example, and you don't have to think about how to store your data. We take care of it. You specify what kind of data it is, metadata or file data, and the underlying framework takes care of the details for you.

So, the server framework I mentioned before: we call it VSSF, Visual Studio Server Framework, or Visual Studio Shared Services Framework. Again, we extracted this from TFS, and it's common to all the services we've implemented. It was interesting: last night I was having a conversation with Tom Moore, who you'll hear more from tomorrow on live site and telemetry. He said, you know, we really don't fully appreciate how valuable that framework is. He runs a service delivery team, and he used to run another service at Microsoft, a set of services, and every service was different. When it came to managing secrets, they were all different. When it came to deployment, they each had their own special way to deploy. Everything was different, and they spent so much energy and lost so much time on that diversity of ways of doing fundamentally the same conceptual thing, each done in its own different way. The framework lets us get some economies of scale. We have a few teams that work on the framework, and everybody else in the organization is able to build on that and not have to reinvent that wheel. I'll give you some key examples here in a minute, but it forms this abstraction layer that allows us to have this consistency between the two models.
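To give a feel for the shape of that abstraction layer, here's a sketch. These are not the real VSSF interfaces, just the idea: feature code talks to one interface, and the host picks the on-prem or cloud plugin once at startup.

```csharp
using System;
using System.Threading.Tasks;

// Hypothetical types sketching the plugin idea; not the real VSSF API.
public interface IFileStoragePlugin
{
    Task<string> StoreAsync(byte[] content);      // returns an opaque content id
    Task<byte[]> RetrieveAsync(string contentId);
}

// On-prem plugin: file content goes into SQL next to the metadata.
public sealed class SqlFileStorage : IFileStoragePlugin
{
    public Task<string> StoreAsync(byte[] content) => throw new NotImplementedException("INSERT into a content table");
    public Task<byte[]> RetrieveAsync(string id) => throw new NotImplementedException("SELECT from a content table");
}

// Cloud plugin: file content goes to blob storage; only metadata stays in SQL.
public sealed class BlobFileStorage : IFileStoragePlugin
{
    public Task<string> StoreAsync(byte[] content) => throw new NotImplementedException("upload to a blob container");
    public Task<byte[]> RetrieveAsync(string id) => throw new NotImplementedException("download from a blob container");
}

public static class StoragePluginFactory
{
    // Feature code never asks where the bytes live; a host-level switch
    // decides once, so the same service code runs on-prem and in the cloud.
    public static IFileStoragePlugin Create(bool isHostedService) =>
        isHostedService ? (IFileStoragePlugin)new BlobFileStorage() : new SqlFileStorage();
}
```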
And again, that's across the whole set of services: 31 services, all running the same way. From an engineering standpoint, I don't have to go fund developing the same thing over and over again. And this sort of operational consistency not only saves us money, because we don't have different engineers doing the same work, but it also means there are fewer opportunities to get things wrong.

It also means that different people can go work on different parts of the service. Tom told me the other problem they had with that set of services: when somebody changed teams, they had to go relearn how to do stuff on that service. When his folks on the service delivery team wanted to work on or needed to support different services, say somebody was out on vacation, it was much harder, because again, it was different for each one. So this consistency has been extremely valuable to us.

So what's in this framework? I'm not going to go through all of these things, but it's a long list, and this is just scratching the surface. A couple of the key ones: I've already called out SQL and storage as key. Another key piece, which I'll talk about later, is handling secrets; we'll touch on that in the security presentation. Tracing is common to all of these services, which means everybody understands exactly how the telemetry flows through the system. And we get resiliency: circuit breakers and throttling are built into the framework, so every service can adopt them without having to rebuild them. And they're not simple. You'll see with throttling, for example, that it's surprisingly complex.

Taking a brief moment here to talk about telemetry in particular, because this was a big mindset shift. We were an on-prem team, and the notion of telemetry was a foreign concept to us, because we sell you the software and you go run it. We don't get to grab telemetry from your server; many of these servers, of course, don't allow any of that stuff to come back up into the cloud. So this was a different thing for the team.

It was also different in terms of debugging. Often, when somebody called up with a problem, we'd want to create a repro: hey, there's a problem, can we reproduce it locally? Well, when you run a cloud service and your customers are suffering, nobody wants to wait for you to get a repro live in some debugging environment where you can step through breakpoints and figure it out. You really have to shift to thinking: can I debug the service from the output of the telemetry that the service produces?

So it became this mantra of trace everything. And as Tom will tell you, that's both good and bad, because at some point you can drown in your telemetry; he'll talk a bit about how we deal with that. But every method, for example: when you fire up the web browser or Visual Studio, you're making calls to REST and SOAP endpoints, and we want to trace all of those. We also want to look at cache behavior.
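Concretely, "cache behavior" means counters like hits, misses, and evictions on every cache, so an investigation starts from data. Here's a toy instrumented cache just to show the shape of the idea; it is not our framework component.

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading;

// Toy instrumented cache: the point is that telemetry (hit/miss/eviction
// counts) is built in, so every cache in the system reports the basics.
public sealed class CountingCache<TKey, TValue>
{
    private readonly ConcurrentDictionary<TKey, TValue> _map =
        new ConcurrentDictionary<TKey, TValue>();
    private long _hits, _misses, _evictions;

    public TValue GetOrAdd(TKey key, Func<TKey, TValue> factory)
    {
        TValue value;
        if (_map.TryGetValue(key, out value))
        {
            Interlocked.Increment(ref _hits);
            return value;
        }
        Interlocked.Increment(ref _misses);
        return _map.GetOrAdd(key, factory);
    }

    public void Evict(TKey key)
    {
        TValue ignored;
        if (_map.TryRemove(key, out ignored))
            Interlocked.Increment(ref _evictions);
    }

    // What the telemetry pipeline would sample: hit rate and eviction counts.
    public string Snapshot()
    {
        long h = Interlocked.Read(ref _hits);
        long m = Interlocked.Read(ref _misses);
        long e = Interlocked.Read(ref _evictions);
        return string.Format("hits={0} misses={1} evictions={2}", h, m, e);
    }
}
```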
So let me take a small aside on caches. You probably all know what a cache is, but I have a different definition of a cache: a cache is a live site incident generator. We stood up the service for real on April 23rd, 2011, and if you look back across all the live site incidents we've had in the six and a half years since, the single biggest thing they have in common is caches. We've hit all manner of bugs: inconsistency, strange behaviors, you name it, it's happened. And every time somebody says, hey, I've got a problem, be it performance or whatever, typically performance, I'm going to go add a cache, I'm like, great, now you have two problems.

So having cache behavior in your telemetry is incredibly key. By the way, there's now a cache class in the server framework: we had so many problems with caches that we said, hey, look, let's at least build a foundational cache component that has the core telemetry in it, eviction rates, hit rates, et cetera, so that you get at least some basic information about every cache we put in the system. It's incredibly important. Caching is so subtle and so seductive: I'm going to solve the problem, I'm going to add a cache, it'll be way faster. Then the cache fails, you turn it off, and you've got a different performance problem, because the system doesn't actually work anymore without the cache. Caches, I can't emphasize this enough, are really dangerous. You need them, you can't live without them, and they are super dangerous.

Then, of course, you want to trace errors as another piece, and you also want to include the inputs. Inputs are really key. So this notion of trace everything really was a big switch for us.

The next piece is that you need to be able to control all this, because some tracing is expensive. Maybe you want to trace a large object, or something you'd want to compute because you don't actually want to trace the raw thing for whatever reason. Sometimes it's expensive, so you need to be able to control it. So we have four levels, error through verbose, that can be turned on individually, with errors on all the time by default. And to control these things, we have this notion of trace points. They're literally just lovely numbers like the one on this slide. They don't have meaning, but say you're having a live site incident, and there's something going on with, let's say, identity. Somebody looks it up and says, I need this trace point, because I want to see what's happened with this descriptor for security. You take that number and turn that particular trace point on. This one, of course, is at trace level info, and you may have multiple uses of the same trace point with increasing levels of verbosity, so as you dig into the problem, you can decide how much data you need to solve it. This has been hugely valuable. We also have variants where we conditionalize it, so we don't do the extra work for tracing if the tracing isn't turned on. But this is, again, a big shift in mindset for a team that was used to box product, on-prem software.
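Here's a sketch of what that tracepoint gating amounts to. The API and the tracepoint number are made up, but the pattern is the one just described: errors always on, everything else switched on by number during an incident, and message formatting skipped entirely when a tracepoint is off.

```csharp
using System;
using System.Collections.Concurrent;

public enum TraceLevel { Error = 1, Warning = 2, Info = 3, Verbose = 4 }

// Hypothetical tracepoint-gated tracer; not the real framework API.
public static class Tracer
{
    // Tracepoints switched on at runtime, each with a maximum verbosity.
    private static readonly ConcurrentDictionary<int, TraceLevel> Enabled =
        new ConcurrentDictionary<int, TraceLevel>();

    public static void Enable(int tracepoint, TraceLevel level) =>
        Enabled[tracepoint] = level;

    public static bool IsEnabled(int tracepoint, TraceLevel level) =>
        level == TraceLevel.Error || // errors are always on by default
        (Enabled.TryGetValue(tracepoint, out var max) && level <= max);

    // The message is a delegate so any expensive formatting or computation
    // is skipped entirely when the tracepoint is off.
    public static void Trace(int tracepoint, TraceLevel level, Func<string> message)
    {
        if (IsEnabled(tracepoint, level))
            Console.WriteLine($"[{tracepoint}/{level}] {message()}");
    }
}

// During an incident: Tracer.Enable(1012786, TraceLevel.Info); after that,
// Tracer.Trace(1012786, TraceLevel.Info, () => "descriptor: " + descriptor)
// starts emitting. (The number is invented; real tracepoints are just ids.)
```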
I mentioned we have a ton of logic in SQL: something on the order of 1,400 stored procedures. It's a lot of SQL. As a result, you need some way to explain to people how to write SQL: how do you write SQL that works on-prem and in the cloud? How do you get consistency across the product, for a variety of reasons? So here are some of the key principles in our SQL guidelines. I won't go through all of them, but the first one is particularly interesting: the cloud and the on-prem product have different characteristics. There's no notion of an account on-prem; it's in the code, but it's effectively hidden from you. When you do an upgrade in the cloud, you're upgrading, as I mentioned earlier, 40,000 tenants in a single database. That means 40,000 collections in a single database, so your upgrade needs to scale. Thinking about these things up front becomes very, very key.

The SQL guidelines talk about all sorts of stuff, down to, at the bottom of my list here, the partition ID for partitioning between different tenants. There's a long list of things we want people to follow. It's actually something I've asked about: hey, do you think we could publish this? We may at some point, because I think there's some real value there. But so much of it comes back to consistency: doing things the same way, so that every time somebody has to go look at, say, a live site incident, they don't have to figure out, oh, well, this team does SQL completely differently than we do. What does that mean? An emergency is not a good time to learn that.

Sorry, but just before you go on there: with so much logic in stored procs, how do you guys test that? So, good question, and I'm trying to decide how deep to go into that answer. Munil Shah, this afternoon, is going to go through quality and testing in general. We have different levels of testing, everything from what you might call a pure unit test, though we give them really boring names to avoid religious debates about what a unit test is: L0 through L3. There's a level he'll talk about, L2, where you can test through REST APIs. But since we have so much logic in SQL, we said, hey, we actually need the ability to test some of these SQL stored procedures as, quote, unit tests. It's kind of a stretch for some people to think of that as a unit test, but level one is like level zero, except it allows you to use SQL. So we have SQL-specific tests, and we also have tests that drive the SQL by calling REST APIs and so forth. I'll leave the rest of that question for his presentation, but it is an important thing for us.
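For a feel of what an L1 might look like, here's a hypothetical sketch. The stored procedure, parameter, and database names are invented, but the idea is a unit-test-sized test that is allowed to call a sproc against a test database directly.

```csharp
using System;
using System.Data;
using System.Data.SqlClient;

// Hypothetical "L1"-style test: scoped like a unit test, but allowed to
// exercise a stored procedure against a test database directly.
class GetWorkItemSprocTest
{
    static void Main()
    {
        using (var conn = new SqlConnection("Server=...;Database=TestCollection;Integrated Security=true"))
        {
            conn.Open();

            var cmd = new SqlCommand("dbo.prc_GetWorkItem", conn)
            {
                CommandType = CommandType.StoredProcedure
            };
            cmd.Parameters.AddWithValue("@partitionId", 1); // tenant under test
            cmd.Parameters.AddWithValue("@workItemId", 42);

            using (var reader = cmd.ExecuteReader())
            {
                if (!reader.Read())
                    throw new Exception("Expected work item 42 in partition 1");
                Console.WriteLine("L1 sproc test passed");
            }
        }
    }
}
```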
So, REST APIs. I mentioned earlier that when TFS began, and I was writing code on TFS back then, we were doing SOAP APIs. SOAP was cool in 2003; it's 2017, and SOAP is very uncool. So we started the shift to REST a few years back. One of the things we did with the SOAP APIs, and we can debate whether or not it was a good idea, is we said SOAP APIs are effectively an internal implementation detail: you're not meant to use them, go use our .NET object model. Fast forward, and the world has shifted. People want to call these things from Macs and Linux and everything else in the world, and the best way to do that, of course, is with REST APIs. Everybody moved over to REST. So the REST APIs, unlike SOAP, are very much part of our public face. We build a platform; you need to be able to call the REST APIs to write extensions that do any number of things, and to build other services that call into VSTS.

Having been part of the problem with the SOAP APIs, I can poke at this: if you go look at our SOAP APIs, you can tell that each one was done by a different team. It's very clear that the version control team and the WIT team, the work item tracking team, did not agree on how to do SOAP, because the APIs are completely different in how they work. It's just crazy. You look at it and go, did they ever talk to each other? Well, I helped create the problem, so I can poke at it.

With REST, I did something I don't do often. I don't often put in review processes, because in an agile world, in DevOps, if you own a service, why should you need to go to some manager to get something approved? But I put a review process in place for REST APIs, because the only way you get consistent APIs is to force that consistency. Left to their own devices, every developer, quote, has a better way. And the problem is, it might be better, it might not, but if you as a developer have to relearn a REST API every time you crack one open, it's super frustrating. I'm not part of the REST API review process myself; I have people who know REST really well who do it. But it's been a key part of trying to give you a set of APIs that look like they're from a single team, not from a bunch of competing teams. Now, you can certainly find places where we've not succeeded at that, but overall, it's been vastly better.

We also didn't invent our own standards. There's a set of Microsoft REST API standards; the notes for the slide have the link, or you can Google it if you like. The Microsoft REST API standards were derived from the Azure REST API standards and the Office REST standards. They each had their own, so they got together and created one single standard. We use that, but there are parts of it that are optional, or where there are two choices of how to do something, in part because it came together from Azure and Office, and they didn't necessarily fully agree on everything. So we get a little more concrete about some of it: when I say it's our standard, it's the Microsoft standard, with concrete choices made where the standard leaves things optional. And we documented this and the whole process on our wiki.

This is important: these APIs can be reviewed over email if they're simple. The committee looks at it, and if there's no big discussion, great, respond to the email and you're kind of done. Or you might be breaking new ground. Maybe you're trying to do something totally different, or you're struggling with resources and verbs and how to map things to your area. Great, have a meeting and go through it. But this process has been really valuable for us in creating APIs that are much more usable and more consistent.
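One visible payoff of that consistency is that every API gets called the same way: a resource under _apis plus an explicit api-version. Here's a minimal sketch of a versioned REST call in the documented style, with a placeholder account name and authentication omitted.

```csharp
using System;
using System.Net.Http;
using System.Threading.Tasks;

// Minimal REST call in the public API style. "fabrikam" is a placeholder
// account; real calls also need an Authorization header (e.g. a personal
// access token sent as Basic auth).
class ProjectsApiExample
{
    static async Task Main()
    {
        using (var http = new HttpClient())
        {
            var url = "https://fabrikam.visualstudio.com/DefaultCollection/_apis/projects?api-version=1.0";
            var response = await http.GetAsync(url);
            Console.WriteLine((int)response.StatusCode + " " + response.ReasonPhrase);
            Console.WriteLine(await response.Content.ReadAsStringAsync());
        }
    }
}
```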