Hey everybody, this is Kasper, and I'm looking forward to speaking about "DevOps Is Dead, Embrace Platform Engineering" here at the CNCF webinar. Let me start by remarking that the title of this presentation is provocative, and I don't really mean it. DevOps isn't dead. Platform engineering and DevOps are actually best friends, but wouldn't you agree it makes for a great title?

Now, I am all about platform engineering, really. I work with and contribute to the platform engineering communities wherever possible, and I encourage you to do the same. It's such an interesting, thriving community, and there's so much going on. There's internaldeveloperplatform.org and platformengineering.org. There's a large Slack group. There are the platform engineering meetups. There's PlatformCon. I myself work for Humanitec in my day job, and basically at night, more or less, here in Europe, for the community. If you have any questions, please feel free to reach out at kasper at humanitec.com or follow me on Twitter at kasperofficial.

Now, let's go back to that question of how and why I chose this title. On the one hand, it has obviously attracted your interest, but there is more to it. I think DevOps in its current form, the DevOps that I see practiced in many, many organizations, is a victim of misinterpretation. DevOps, in its core essence, is about culture, about people management, and about aligning people along a joint delivery process. But DevOps is not the idea of taking everything that used to sit in specific silos and just throwing it at the individual contributor without any filter. I really believe that is not a very responsible or clever thing to do. I think it's producing a lot of burnout, frankly. It's very bad for the mental health of many, many people.
And it's slowing down the productivity of the respective teams and, frankly, just leading to a lot of frustration.

Now, I think one of the things that we have to acknowledge is that cloud native is dirty, it's chaotic, it's complex. Developers waste a lot of time operating apps and trying to make sense of how things fit together. If we think about the applications that we operated 10 years ago, they were a lot simpler. They had less global scale, less distribution, fewer users, fewer tools, and they were monoliths. Everything is becoming much more complex. And I don't think that this is because we just like to procure new stuff or try out new technologies, which is often given as the reason. I think it's because the world we live in is global and complex in itself, the number of users we serve keeps growing, and our applications need to meet the demand of an ever-evolving, ever more complex globalization.

Now, I always think that this is the core point of all of this, and this is why I want to emphasize it again: it's not DevOps that is dead. DevOps will probably remain the philosophy, the concept, and the guiding star for as long as we are building software. But "you build it, you run it" at all costs, that is dead, and it will not deliver results. In fact, it will lead to that burnout of your teams. The answer to that, I think, is platform engineering. Platform engineering, if you want, is to a certain extent the implementation of good DevOps practices.

Now, let's look at a little graph here. The pink curve is a proxy for cloud-native adoption. In the end, it's just container adoption, which I believe is a good proxy. And the yellow curve is the rise of internal developer platforms and platform engineering. If we go back a little bit in time, 16 years by now, we have a noteworthy presentation by Werner Vogels, who I think still is the CTO at Amazon Web Services. And he proclaimed this: you build it, you run it.
Now, if you think back to 2006, the applications that we just discussed were a lot simpler than the ones that we are confronted with today. That means that, in the context of the time, Werner Vogels' claim made a lot of sense. Yes, "you build it, you run it" is a good idea. But as I just suggested, cloud native has become a lot more complex. And as this cloud-native trend evolves, we're seeing more and more teams look at this and say, this can't be the answer. We're actually overwhelming people. We're not helping them. We're just throwing stuff at them. We're making things more and more complex.

And then in 2011, there is cloud native at Google. As they observe that things are getting more and more complex, they start actually building platforms and rolling them out at scale. And you can see that five-year lag from cloud native to "hey, we need to change this" in a lot of different teams. Zalando in Europe. GitHub in the United States, where Jason Warner moves over from Heroku after being their senior vice president, becomes the CTO at GitHub, and then starts to build an internal Heroku based on Amazon Web Services. That five-year lag holds true in almost every organization that I'm observing. After five years, they're saying, oh, you know, it's not as easy after all.

Let's not forget that Kubernetes on EKS, if you want to stick to the Amazon world, has been around for, I don't know, three or four years. So it's still very fresh. And so now more and more teams are saying, OK, this can't be the truth. We have to simplify this. And this is why platform engineering is picking up so rapidly. It's not because Gartner made it a top trend for 2023. No, Gartner made it a top trend for 2023 because it's a response to the complexity that we are currently confronted with. That's why everybody is speaking about platform engineering. And that's why we're here today.
Now, what is platform engineering? Why do we build these platforms? Well, I think in the end, it's about being fair towards different parts of the organization. Let's look at the application developers and then at platform operations as two teams.

For the application developers, it's really about reducing waiting times, reducing cognitive load, making things easier to consume, and enabling self-service. It's about making it really simple to change configurations, and making it really simple to request an S3 bucket without having to learn yet another technology. Or maybe you're even good at Terraform, but you just want to focus on the application. You're just not trying to understand how that particular networking is implemented right now.

I think it's very important that we achieve all of that without opaque abstractions. We shouldn't go back to the world we had in 2014, 2015, where we had these platforms that were black holes, and you had no idea what would happen under the hood if you sent a certain command. So, modern platforms are golden paths, not opaque abstractions. We want to reach this while not taking away context.

Think about a company like Google. At Google, you say, hey, I need a certain resource component, and then the system will say, OK, what's your context? Are you deploying a change of that app to that environment? Well, I'm going to match you that resource. And that feels like central Google teams are deciding which resource to match, but they're actually very, very good at giving you the context: hey, we're choosing this resource, and this is the way it's configured because of A, B, and C. And if you want to apply a change, then that's the team you can speak with, or you can actually apply that change yourself if you want to go deeper. So I call that layered abstractions.
Let the users decide how much cognitive load they want to expose themselves to.

Now, the second team that is actually touched by this is obviously the operations team, platform team, or SRE team, depending on which latest buzzword you're using. For them, it's really about standardization by design and, frankly, reducing repetitive tasks. I think it's a good time to look at platform engineering if you have a Jira board that is overflowing with "hey, can you help me debug this?" and "hey, I need a new environment here." At that point, platform engineering might be something to look at. Reduce ticket ops, and then actually allow these teams to focus on hitting their SLAs and improving the tools and practices they're using right now.

Now, what's an internal developer platform? Well, it's pretty simple. It's the sum of many components that form golden paths for developers. There is no such thing as the one IDP, and that's frustrating for many people who want easy answers to complex questions. An IDP for a health care provider in South Africa might look very different from the one at a financial services company in Norway or an e-commerce company in San Francisco. But it's really about making sure you have reusable components. You wire up all your dev tools and orchestrate them in a structured manner. You allow the development team to consume that with low cognitive load, and you allow the platform engineering team to automate it to actually reduce human interaction. And that, in combination, helps you operate the infrastructure in a repeatable manner.

Now, I'm from Germany, so I obviously need to think about cars all the time. And I have this analogy with the car manufacturing industry. Immature organizations tell teams to build a car, give them a credit card, and then tell them where to find a store with raw materials. And then when they struggle, they send in people to help them organize, and they call that DevOps.
Now, mature organizations tell teams to build a car, give them a credit card, and tell them where to find the raw materials. If teams struggle, DevOps helps them organize, and cloud operations teams help them find the right materials and prep those for them. And then advanced organizations tell teams to build a car, the platform team prepares a platform, the developers build on top, and DevOps helps them structure their work.

And if you think about this, and that's something Jason Warner said that resonated a lot with me: in a good setup, developers and operations do not talk about transactional things. They speak about how we can improve the process, and how we can make work easier and less interaction-heavy for all parties involved. But they don't speak about, hey, can you please spin up that environment? Because that just leads to waiting time and doesn't actually produce any tangible value or outcome.

But that's all well and good. Congratulations, yet another category: platform engineering. Where do you actually get started with it? I've been fortunate enough to help a number of organizations on their journey towards platform engineering. And there is a certain playbook that I've been picking up, to a certain degree a step-by-step guide, that will help you dissect how to actually approach platform engineering.

As a first step, I think it's really about making sure you treat your platform as a product. Internal developer platforms look so different, no matter which organization you look into, that there is no one size fits all. There is no recipe that you can look up on Reddit that will just give you a sensible default. You need to shape this and tailor it to the situation that you find yourself in. You will have different CI pipelines. You will run different types of workloads. And all of that needs to be mirrored within the platform.
Now, I think it's also important to understand that, depending on your industry, you will have completely different governance and security requirements. It makes a difference whether you work in the US government sector or at an e-commerce company in the basement of a Berlin hipster startup.

Now, if we speak about platform as a product, what we mean is that you should really treat it as a product. Assign a product owner. Maybe you can't afford somebody full time, and it's a side project for one of your POs. They should have a clear roadmap and tickets in Jira or whatever other project management tool you're using. A platform team should have software engineers who can actually build stuff. You do user interviews. You have a user, right? It's an internal user, but a user is a user. You should measure whether your users like your platform. And you need to do things step by step and really see them through.

So, number one, which sounds trivial, but not enough platform engineering teams do it: make sure you treat your platform as a product. Manuel Pais and Matthew Skelton have said that for years. But it's so vital, so important. Really make sure that this is the way you approach and treat things.

Now, number two, prioritization. I think many teams fall into a prioritization fallacy where they let themselves be guided by feelings on what to do first. The most obvious thing is that they ask, what's the first thing a developer does when they join a company? Well, they onboard. So let's start with making it really simple for developers to onboard. Another behavioral pattern that I'm observing is that they ask, what's the first thing we do when we start a new application or service? And then the answer is, well, we're starting a new service.
Let's say it's a Spring Boot service: we'll take a template, then we'll clone it, and then we'll start to customize it. And just because that is the first thing you do with an application, what they actually end up building is a really simple way to start and build a new microservice.

That might make sense in specific situations, but ask yourself: how often do you actually onboard colleagues? How often do you actually spin up a new service? And even if you do that very often, how much time does it take you? And even if it takes you a long time, isn't there a lot of other stuff that's less obvious and less anchored in your brain, that takes a lot more time and a lot more interaction between developers and operations, and thus actually offers a lot more ROI, return on investment? So the first things that come to mind when you think about platforms and what you could optimize don't necessarily reflect the things that you should actually focus your attention on when you start your platform journey.

I always propose the same procedure for approaching this. Take a white piece of paper, sit down, and ask: how often do you go beyond the simple update of an image? And don't ask yourself. Let me correct myself. Ask your users: your individual front-end developers and back-end developers, your operations teams, juniors and seniors. Make sure to speak to a couple of them. How often are they adding environment variables? How often do they change configurations? How often do they spin up a new environment? How often do they onboard new colleagues? Then normalize that. So against 100 deployments, how often do they do that individual action? How much developer time is involved per individual action? How much operations time is involved per individual action?
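
To make the arithmetic of this exercise concrete, here is a minimal sketch in Python. It normalizes each action per 100 deployments, weights it with the developer and operations time per occurrence, and scales it to a total number of deployments. The action names, frequencies, hours, and the deployment count are all made-up illustration values, not figures from the talk, and the function names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Action:
    name: str
    per_100_deployments: float  # how often the action occurs per 100 deployments
    dev_hours: float            # developer time per occurrence
    ops_hours: float            # operations time per occurrence


def total_hours(action: Action, total_deployments: int) -> float:
    """Hours the company spends on this action across all deployments."""
    occurrences = action.per_100_deployments * total_deployments / 100
    return occurrences * (action.dev_hours + action.ops_hours)


def prioritize(actions: list[Action], total_deployments: int) -> list[Action]:
    """Sort actions by total hours spent, largest first."""
    return sorted(
        actions,
        key=lambda a: total_hours(a, total_deployments),
        reverse=True,
    )


# Hypothetical survey results, gathered by interviewing developers and operations.
SURVEY = [
    Action("add environment variable", 5.0, 1.0, 1.0),
    Action("spin up new environment", 0.5, 2.0, 8.0),
    Action("onboard new colleague", 0.1, 16.0, 8.0),
    Action("start new service", 0.2, 8.0, 2.0),
]
TOTAL_DEPLOYMENTS = 2000  # assumed number of deployments in the period

for action in prioritize(SURVEY, TOTAL_DEPLOYMENTS):
    hours = total_hours(action, TOTAL_DEPLOYMENTS)
    print(f"{action.name}: {hours:.0f} hours")
```

With these invented numbers, the small but frequent action (adding an environment variable) ends up costing more hours than onboarding or starting a new service, which is the kind of result gut feeling tends to miss.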
And then, you know, sum this up, multiply it by your total number of deployments, and you have your ROI case. Then just look at where the largest number of hours goes that you spend as a company on those specific things. And there you go. You have your prioritization. Make sure you do that exercise. Don't just think you can sense it from your own experience. It's very unlikely you can.

Step number three: agree on the lowest common denominator tech stack. It's unlikely that you can churn out something that suddenly works for everything. If you find yourself building a very long list of everything you need to support, you will probably not be able to ship any product anytime soon. And who knows whether you'll still have your funding if you can't provide tangible results fairly quickly. So look at what's the most used thing, or at what the future holds. Is the future VMs, or is the future Kubernetes, or is the future Lambda functions? I don't particularly care, but you should find your lowest common denominator and then start optimizing against it. In most cases, it will be containers and Kubernetes. But, you know, everybody's different. Maybe it's a managed service from one of the providers. Just make sure you don't try to do everything at once.

Number four: you want to find what I call a lighthouse app, a lighthouse team. One team of people who are really innovative, who want to try new things, who are always front and center. If there's something new to try, they are really interested in doing it, providing feedback, and testing stuff. And they should have an application that's exactly on your lowest common denominator tech stack. Take that team, and then work with that team very closely.

As a next step, decide on the general architectural layout. We are seeing two patterns trending: dynamic internal developer platforms and static internal developer platforms.
In the end, it comes down to the methodology of configuration management that you choose and apply: static configuration management or dynamic configuration management.

The one that we're seeing right now is a static IDP. You have your IDE to code, and then you have static config files in your version control system: your workloads, YAML files on an environment-by-environment basis, and infrastructure as code. You have your CI and registries to actually build the workload. And then you throw everything at a CD tool or a controller, and stuff gets deployed. So that's the approach and the general architecture with static configuration management. That's, I would say, the most common approach, but also, I wouldn't call it outdated, probably the less modern or less flexible approach.

The second approach, which we're seeing develop really quickly now, is dynamic IDPs with dynamic configuration management. That's the idea of layered abstractions. It's still the same progression. You have your IDE to code. And then in your Git, you have a number of different files. You have your workloads. You have a workload specification that describes the workload in an environment-agnostic way, so one file that works across all environments. There is an open source project called Score that deals with exactly that. Workloads and workload specifications are the two things that developers usually deal with. So they say, hey, this is my workload, and it depends on a database of type Postgres, in an abstract way, rather than pointing at that particular RDS instance. And then that request from the developers gets matched with the sensible defaults of the platform or operations team, if you want. So the platform team can say, well, if the context is staging, then please match that particular RDS or create that DNS. And there are still normal CI pipelines to build the workload.
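
The matching described here can be sketched roughly as follows. This is a simplified illustration in Python, not the format of Score or of any particular platform orchestrator: the `WorkloadSpec` class, the `PLATFORM_RULES` table, the `resolve` function, and all provider names are hypothetical. The developer declares an abstract dependency once, and the platform team's rules decide per environment type how that dependency is satisfied.

```python
from dataclasses import dataclass


@dataclass
class WorkloadSpec:
    """Environment-agnostic description written by the developer (hypothetical shape)."""
    name: str
    image: str
    needs: dict  # resource name -> abstract resource type, e.g. {"db": "postgres"}


# Rules owned by the platform team (hypothetical):
# (resource type, environment type) -> how to satisfy the request.
PLATFORM_RULES = {
    ("postgres", "development"): {"action": "create", "provider": "in-cluster-postgres"},
    ("postgres", "staging"): {"action": "match", "provider": "shared-rds-staging"},
    ("postgres", "production"): {"action": "match", "provider": "dedicated-rds-production"},
}


def resolve(spec: WorkloadSpec, env_type: str) -> dict:
    """Combine the developer's abstract request with the platform defaults for one context."""
    resources = {}
    for res_name, res_type in spec.needs.items():
        rule = PLATFORM_RULES.get((res_type, env_type))
        if rule is None:
            raise LookupError(f"no platform rule for {res_type!r} in {env_type!r}")
        resources[res_name] = {
            "type": res_type,
            **rule,
            # Keep the context visible so the abstraction is not a black box.
            "reason": f"platform default for {res_type} in {env_type}",
        }
    return {
        "workload": spec.name,
        "image": spec.image,
        "environment": env_type,
        "resources": resources,
    }


# One spec, reused unchanged across all environments.
WORKLOAD = WorkloadSpec(name="checkout", image="checkout:1.4.2", needs={"db": "postgres"})

for env in ("development", "staging", "production"):
    print(env, resolve(WORKLOAD, env)["resources"]["db"])
```

The design choice worth noting is the `reason` field: the developer can ignore it and just deploy, or read it to understand why a given resource was chosen, which is the layered-abstraction idea in miniature.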
And then there is a platform orchestrator that basically reads that declarative application model and says, well, where am I? OK, an environment of type staging. I'm going to create the actual config files, context specific. I'm going to match the infrastructure, or I'm going to create new infrastructure. And then I'm orchestrating the workloads, resources, and providers. The beauty is that the developer can decide: how deep do I want to go? How much cognitive load do I want to take on? It's a clear separation of responsibilities between dev and ops without getting in their way, and it drives standardization by design.

So those are the key patterns. If you've gone down one path, you can still change that. But I would say, think about this before you start building.

Then start building. Again, I believe in platform as a product and in small changes. Make sure you build the one thing that matches your return on investment, think of the calculation that we did before, and make it 10x better. Even if it's only a small, tiny thing, you want to make sure that your evangelist team loves it and thinks, oh, this is so much better than everything we had before. Over-allocate time on this. It's the small details that really make your early fans. And then you can take those ambassadors, if you want, and start scaling this to the next teams. Expand your usage. Look at the next things that actually provide ROI.

Never forget that there is no platform that's finished. It's constantly evolving. Once you've reached a certain point, there will be a new technology to weave in. So iterate, iterate, iterate. And I encourage you to get help and exchange thoughts with fellow platform engineers. There's platformengineering.org, and PlatformCon as a conference. There are the meetups. There's the Slack group. I hope you make the most of it and enjoy your journey. It's such an exciting time.
If you want to chat about this again, reach out to me at kasper at humanitec.com. And until then, thank you for listening to this talk, and see you soon.