Hi everybody. We're here to talk to you today about our motivations and experiences in building a GitOps-driven unified control plane at Autodesk. Because this is technically a lightning talk, we're going to be moving pretty quickly, so be ready for lots of slides to come at you. Before that, though, I'd like to quickly introduce ourselves. Do you want to go ahead, Cole? Sure. Is this on? Yep. Hi everyone. I am Cole Duclo, and I'm an architect and principal engineer at Autodesk. I've been focusing primarily on our internal developer portal and deployment platform. Hi everybody. I'm Greg Haynes. I'm a software architect as well at Autodesk, focused on developer enablement and software delivery. My background is mostly in cloud open source software, ranging from OpenStack to Kubernetes and KD for a while. For those of you who aren't familiar with Autodesk, I'd like to start with a little bit about what we do. We build a large suite of design tools that help design the world around us. These range from best-in-class 3D animation software used in motion pictures all the way to widely used civil engineering and construction software. Our role on the developer enablement team is to ensure that our developers can efficiently deliver this high-quality software. To start, I'd like to describe the beginning of our developer platform. This is a very high-level view. It should look familiar to many of you who have started a similar developer platform: it basically grew out of tying existing CI/CD tooling into our source control system, GitHub in this example. In this case, Jenkins was directly tied into GitHub, and users interacted with both Jenkins and GitHub to deliver software to our cloud environments. But over time our CD needs diverged from CI, as I'm sure many of you have also experienced.
We needed some complex functionality to implement promotion workflows and auto-scaling, and to solve regional deployment strategies. To solve this we adopted Spinnaker to manage our CD processes and kept Jenkins owning CI. But regardless of the specific tool choice, we now ended up with another user and software interaction: this separate CD tool, Spinnaker, which has even greater depth of functionality than our Jenkins. Shortly afterwards, the scope of our developer platform grew beyond application delivery. We needed to provide compliant infrastructure management, cloud accounts with networking, and generally own all this increasing functionality that our engineering teams needed to ship software quickly. Now we have the suite of in-house tools on the bottom right here, which integrate with Spinnaker and which our users also need to interact with. These tools cover things like account and networking creation, service ownership, and access, security, and compliance requirements. And the result is that our users and our CI/CD platform now have this n-by-n integration and interaction matrix. While this evolution has been necessary and has enabled us to scale, we've essentially created a large, complex system where the complexity and integration cost is being paid by our users. So, stepping back and with my architect hat on, this looks like an architecture problem. Fortunately there are well-known architectures for simplifying this integration cost and enabling composability. I'm talking specifically about service-oriented architectures. So what would our ecosystem look like with a service-oriented architecture? At the center we'd have our service bus, or what we call our control plane. At the bottom we'd have capabilities added by both our in-house tools and cloud providers. At the top we have our various software lifecycle tools, which make use of these capabilities as they're exposed via the control plane.
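To make the integration-cost argument concrete: with point-to-point integrations, every consumer wires into every tool, while a shared control plane means each side integrates once. A toy back-of-the-envelope calculation (the tool and consumer counts here are illustrative, not Autodesk's actual numbers):

```python
# Rough integration-cost comparison: point-to-point wiring vs. a shared
# control plane ("service bus"). The counts are purely illustrative.

def point_to_point(consumers: int, tools: int) -> int:
    # every consumer integrates directly with every tool
    return consumers * tools

def via_control_plane(consumers: int, tools: int) -> int:
    # each consumer and each tool integrates once, against the control plane
    return consumers + tools

consumers, tools = 5, 6
print(point_to_point(consumers, tools))     # 30 direct integrations
print(via_control_plane(consumers, tools))  # 11 integrations total
```

The quadratic-versus-linear growth is what makes the n-by-n matrix painful as the platform adds tools.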
Going deeper, we need a common way to expose and interact with these capabilities. Desired state, as we've heard many times at this conference, is an ideal way to design these interfaces, and doing so enables composability and encapsulation for these capabilities. It also enables them to own the lifecycle of the features they implement. Not only is this a common pattern for a service bus, we also have a common language and design methodology for implementing these capabilities, shown by how we have this user with their desired state on the far left here, and all these tools speaking the same desired-state language. This is the power of a unified declarative control plane powered by GitOps. Fortunately, this isn't something we've had to build entirely in-house. This is the new use case of using the Kubernetes API as a control plane that we're diving into, and its common language is CRDs, which enable the creation of APIs to define declarative desired state. Users define application CRDs on the top left there, while lifecycle tools define and interact with job, deployment, and other more granular CRDs, and all of this is enabled by the Kubernetes API server, which centralizes the implementation of boilerplate and common patterns to deliver these capabilities. This is the exciting journey that we embarked on at Autodesk about a year ago, and although we're still relatively early in this effort, we've learned a great deal about its strengths and weaknesses. So with that, I'd like to hand off to Cole to discuss that topic in greater detail. Thanks, Greg. So what are the implications of going from imperative to declarative here, particularly at a large organization like Autodesk that has many heterogeneous requirements? Declarative means always-on convergence.
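The always-on convergence described here is, at its core, the reconcile pattern that Kubernetes controllers are built around: repeatedly compare desired state (e.g. from a CRD) against observed state and compute the actions that converge the two. A minimal sketch of that loop, illustrative only and not Autodesk's implementation:

```python
# Minimal sketch of the desired-state reconcile pattern behind a
# declarative control plane. Both states are modeled as simple dicts
# mapping resource name -> spec.

def reconcile(desired: dict, observed: dict) -> list[str]:
    """Return the actions needed to converge observed state to desired."""
    actions = []
    for name, spec in desired.items():
        if name not in observed:
            actions.append(f"create {name}")      # missing: create it
        elif observed[name] != spec:
            actions.append(f"update {name}")      # drifted: correct it
    for name in observed:
        if name not in desired:
            actions.append(f"delete {name}")      # unwanted: remove it
    return actions

desired = {"web": {"replicas": 3}, "worker": {"replicas": 1}}
observed = {"web": {"replicas": 2}, "cache": {"replicas": 1}}
print(reconcile(desired, observed))
# → ['update web', 'create worker', 'delete cache']
```

A real controller runs this continuously, which is exactly why manual out-of-band changes (the "knobs" discussed next) get reverted by the platform.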
Desired state is going to be enforced by the platform, so we're introducing new capabilities and new responsibilities to the platform, resulting in new behaviors and patterns that our engineers have to get used to. We're really introducing a whole new way of thinking about delivery and deployment at Autodesk. It's almost as if we're asking our application engineers to learn a new programming language, or even worse, to rewrite their applications in a new programming language, and a change like this is monumental. We understand it's extremely hard, so we empathize with our application teams. We've already asked them to migrate from our pre-existing deployment platform to our current, newer one. Can we really ask them to change again? Probably not, at least not anytime soon. They're already fatigued, and we know that fatigue is going to lead to them not trusting the platform any longer. Speaking of trust: we're asking our application teams to adopt higher levels of abstraction and more standards, and they're no longer going to have these knobs that they can turn like they used to. We're asking application engineers to trust that the platform is going to make the right changes for their applications, all the time, constantly. So we're ultimately asking them to relinquish control back to the platform. Our deployment pipelines will no longer have a terraform plan that they can observe and approve to apply changes to their infrastructure. App teams can no longer YOLO changes into production through the AWS console. Incident firefighting is going to have to happen through our platform as well. This feeling of losing control is completely terrifying. So a control plane like this is incredibly powerful. We mentioned this early on, and in many other talks. You can insert any of the sci-fi or superhero quotes here: "Danger, Will Robinson," "with great power comes great responsibility." They're all applicable.
So how do we prevent application engineers from accidentally introducing damage or incidents into their applications or products? How do we prevent bugs being introduced by the platform itself and causing unintentional damage? I know what you're thinking. We've all been there. Uh-oh. This seems like a very risky task. Is it really worth it? We felt the same way. So there must be a way that we can mitigate these risks. There's got to be a way. How do we do that? Well, at Autodesk, we plan to take a very slow, deliberate, incremental approach. We want to make sure that we're crawling before we try to launch to the moon. We want to ask for feedback directly from our application engineers so that we can deeply understand their concerns, and we want to look for ways to mitigate those concerns along the way. We want to start with the least risky resources first and progress from there; that way we can validate the vision of the platform early on and iron out all of the kinks that might pop up. And we don't want to expose any of these changes to our product teams until we are confident in the solution. So, in order to avoid frustrating our engineers along the way, we aim to maintain the same, or a near-identical, experience as best we can, at least for now anyway. This means that we don't want to change any of the APIs that our product teams are interacting with; rather, we prefer to translate them behind the scenes. We want to use the same tooling that our application engineers are used to today. For example, in our case, we use Spinnaker as the main delivery platform and delivery tool that our application engineers have grown to use over the last year or so, and it's taken them a while to get used to that experience. Spinnaker also provides a lot of value for us when it comes to promotion, as well as things like blue-green deployments and metric-based canaries.
So by achieving a familiar experience for our engineers, we will hopefully avoid that change fatigue that I talked about earlier. With all of these risks and mitigations in place, is it still worth it? Should we tackle this beast and transition to a GitOps platform, a unified control plane? Will this delight our customers in the long run, and will it help us achieve our organizational goals of increasing developer productivity? The short answer is yes. In order to succeed, we need to go from our current state, which is this complex interaction matrix that Greg described, to this: a most necessary change to ensure that we can scale to meet the needs of Autodesk, to achieve continued compliance, to unlock developer productivity, and to allow other organizations within Autodesk to contribute back to the platform and help us grow. So thank you. We're ready for questions if you have any. I think we have a few minutes here. Go ahead. [Audience question, start cut off] ...that maybe developers are used to, and some of the new features that you would like to introduce that maybe don't translate as easily. And if you have encountered that, how have you dealt with it, or how do you plan to deal with it, if you ever get to that point? So yes, we want to validate the idea of the platform. That's really the idea of translating these things behind the scenes without disrupting the experience. And once our platform team builds the muscle of developing CRDs and interacting with Kubernetes APIs, and we start to feel confident in this platform solution, then we're hoping to build automated ways to allow our engineers to accept these new interfaces that we are developing. So we would create a pull request in their repositories, for example, and it's just as simple as merging it in, and then hopefully the platform takes the right branch. And to get very specific, for us there's also this thing called an ADF, which is just some YAML files that get translated into these CRDs.
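The behind-the-scenes translation described here (existing ADF YAML rewritten into CRDs) might be sketched roughly as follows. Note that the ADF field names and the CRD group, version, and kind below are all invented for illustration; the real ADF schema and CRD definitions are internal to Autodesk.

```python
# Hypothetical sketch of translating an internal deployment manifest
# ("ADF"-style document, here modeled as a dict parsed from YAML) into a
# Kubernetes-style custom resource. All field names are invented.

def adf_to_crd(adf: dict) -> dict:
    """Map an ADF-like document onto a CRD-shaped manifest."""
    return {
        "apiVersion": "platform.example.com/v1alpha1",  # hypothetical API group
        "kind": "Application",                          # hypothetical kind
        "metadata": {"name": adf["service"]},
        "spec": {
            "replicas": adf.get("instances", 1),        # default when omitted
            "regions": adf.get("regions", []),
        },
    }

adf_doc = {"service": "billing", "instances": 3, "regions": ["us-east", "eu-west"]}
crd = adf_to_crd(adf_doc)
print(crd["metadata"]["name"], crd["spec"]["replicas"])
# billing 3
```

A translation layer like this is what lets teams keep their existing files and tooling while the platform adopts CRDs underneath, at the cost of the legacy-format limitations discussed next.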
And the ADF, which is the in-house thing, lacks functionality. So you end up with the same legacy software cost problem: we don't have as good versioning or migrations and things like that. So it's a matter of drawing the line: when do those features become extremely important, and when does the cost of maintaining legacy software outweigh the benefits? Awesome. Thanks. There's someone over here; just go for it. Hi. So you were mentioning how your aim is to remove those knobs and levers for the developers, like directly changing things in production. I'm curious what journey you went through to move to that, because my org wanted to do something similar, and the pattern that we found was, okay, maybe a service architecture. But then the argument against it was, well, doesn't that block innovation? Now developers can't play around and test to iterate on that innovation. So did you encounter that friction when you went through that journey? Yeah, we definitely faced a very similar sense of friction. Any platform that we develop has to have some sort of extension point to it. So we do offer pipelines in Spinnaker that allow for more flexibility, by basically bringing your own Terraform and things like that. That's one way of approaching it. But we really are doubling down on the idea of standardization, so that we can make pointed investments in tools that allow application engineers to really focus on driving value for our customers. And so behind that becomes our goal to evangelize those standards and get people bought into the idea that, hey, maybe fewer decisions around tools affords you more time for innovation and creativity.
And to back up what he's saying, the business justification is less about taking away access. A lot of what we've done is justify it through added value. For example, we need to do a lot more regionalization work; standardizing delivers business value because we can own regionalization at the platform level. So it's more about adding value than taking away; we haven't really come at it from the taking-away-control angle. Cool. I think there's a talk pretty quickly after this. But thank you. Thank you so much.