Hello, DevXDays, and welcome to my talk today about Cloud Native Foundation for Developers and Platforms. My name is Max Karbecher, I'm from Liquid Reply, and I have spent the past years designing platforms and building platform engineering teams with various customers, focusing on application delivery: how to bring software code from local machines to remote development environments to Kubernetes clusters. I also advise clients on how to navigate the whole cloud native environment with its hundreds of open source tools. I'm part of the Kubernetes release team, basically in the evenings and nights, at least from a German perspective. And together with my colleague Christoph Degaszer I run a newsletter called Native Cloud Dev, which is published weekly.

So you have maybe read one of these articles. They are all about the term platform engineering and what it actually means. And I think there are at least 20 or 30 other articles which go more in depth on all these topics. So we want to talk about what this platform actually means and what the engineering part in it is.

When we are talking about platforms, and when all these articles are talking about platforms, it most often means Kubernetes. In my opinion, it can also be other things like Nomad, or, I don't know if Mesos is still a thing, or OpenShift or whatever, but overall it is a tool which abstracts away the infrastructure. As I already said, application delivery is a huge part of my job, and it is also very important nowadays. How do we get the code to the environments? How can we ensure a good developer feedback cycle? How can you get insights into your tool and its performance? And then for us it's very important to have automation. Automation is king, but nowadays we need to build automation of various kinds. A developer maybe wants to spin up a cluster fast to test something.
This should not take long, but the cluster also doesn't need to last long. On the other hand, you have clusters which, following the same principles, need to live for a much longer time. And then everything should be declarative. We utilize APIs, throw our manifests against them, and tools take care of provisioning what we have requested. The foundation for all of this is that we end up with a trustworthy, reliable infrastructure which can be utilized by development teams, but also by operations teams, to run software, to run your business logic, somewhere in production.

Now, platform is a very poor term. There can be a lot of different kinds of platforms. We are also talking about platform as a service, but sometimes it is a little bit more than that. And at the moment, one of the common trends is to call it an internal developer platform. It should represent that the internal development team provides code through all the mechanisms there to deploy the application, and that the platform takes away all the complexity behind it: DNS resolution, image registries, infrastructure and so on. I'm sometimes not very happy about the terminology. I think, for example, it falls a little bit short on the operational perspective, but at the moment I do not know a better term for it. So please feel free to put some nice wordings in the chat if you have some.

So when do these IDPs appear? Well, if you work at one of the cool Spotifys, Netflixes and Lyfts on this planet, you most likely have something like this, but you have it because you're IT driven. You're born out of IT. You're born out of automation with a software focus. And because of these capabilities, you build a product. Now, most companies on this planet are not software companies, surprisingly. They have other backgrounds in engineering, chemistry, banking, financial services, whatever.
So they have a really long IT history, and most of the time that IT history is way older than most of us. But nowadays they also need IDPs to orchestrate all the chaos and to move ahead, because the way software was developed in the past is not applicable anymore, at least not to keep pace with all the cool, young IT hipster companies.

So when we're talking about an ideal IDP, we need to start with: where does the platform engineering team come from, and what does it want to achieve? I will not go into every detailed point here. You can also read it afterwards by yourself. But what is important is whom we need to support as a platform engineering team. And in the first place it is always the development team. We need to lower the walls, the hurdles, for developers to join these platforms, give them a single point of entry, and let them rely on the fact that whatever they need will be provisioned through the platform and that they get immediate feedback to utilize it.

I also said the operations team is a very, very important piece of it, and it falls a little bit short in the terminology and the point of view of IDPs. Nevertheless, platforms also need to be operated. This can be done by platform engineering teams, but most of the time there's someone else in addition who runs not only the platform but also the application. So we need to bring all the good and relevant tools into this discussion from the beginning. And we need to utilize what the platform allows us to do, like self-healing, restarting containers when they are failing, replacing whole infrastructure, switching to other data centers if possible.

Security, at least for me from a German point of view, is another very, very big thing. We need to consider security from day one, actually. In reality, security is mostly applied at the end, and that makes it more difficult.
I think with IDPs we have a chance to apply security rules very early, to do all the heavy lifting of encryption, of threat detection, of SIEM integration, and to take this all away so that you can purely focus on the main business logic of the application.

And we also need to support the business somehow, because that is where the money comes from. On the other hand, we also need to make transparent where the money goes. With a platform, or a combination of platforms, we can create a kind of transparency: show which development, which solution maybe costs you more or less, where you have higher reliability in the software, and which services are maybe more frequently requested than you actually expected.

So what you actually want to achieve is that, from a development or application point of view, or from what you may want to call a product, your cloud platform and all your infrastructure is abstracted away. You maybe see the database entry points, something for message streaming, something for logging. You need to store some of your secrets. Just these different things. And most of the time, you're very, very sure which platform you're running on. You know that it's a container, maybe in Kubernetes. You know that you install on a virtual machine, and so on. But what would be even better is to get rid of these things, which are too specific. When you get your platforms right and your entry points are correct, then you can go one more step away from it, and no one needs to know anymore where it is running, on which platform, in which environment. Is it an OCI-compliant container or something else? Now, this is a little bit of a future topic in my opinion, but it's where we are going. The only reason why we are not there yet is that nowadays you still need to define how the software should be deployed. And this is actually something which the people who write the software know best.
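To make this abstraction a bit more concrete, here is a minimal sketch in Python. It is only an illustration under assumptions, not how any specific platform does it: the names `WorkloadRequest`, `PLATFORM_CATALOG` and `resolve_bindings`, as well as all endpoints, are hypothetical. The idea is that the application only states what it needs (database, messaging, secrets) and the platform decides per environment what that resolves to.

```python
from dataclasses import dataclass
from typing import Dict, Tuple


@dataclass(frozen=True)
class WorkloadRequest:
    """What an application asks the platform for, without naming any provider."""
    name: str
    needs: Tuple[str, ...]  # abstract capabilities, e.g. "database"


# Hypothetical catalog maintained by the platform team: one set of
# concrete entry points per environment. All endpoints are made up.
PLATFORM_CATALOG: Dict[str, Dict[str, str]] = {
    "dev": {
        "database": "postgres://db.dev.example.internal:5432",
        "messaging": "nats://bus.dev.example.internal:4222",
        "secrets": "vault://dev/",
    },
    "prod": {
        "database": "postgres://db.prod.example.internal:5432",
        "messaging": "kafka://bus.prod.example.internal:9092",
        "secrets": "vault://prod/",
    },
}


def resolve_bindings(request: WorkloadRequest, environment: str) -> Dict[str, str]:
    """Map the abstract needs to the concrete entry points of one environment."""
    catalog = PLATFORM_CATALOG[environment]
    missing = [need for need in request.needs if need not in catalog]
    if missing:
        raise LookupError(f"{environment} does not offer: {', '.join(missing)}")
    return {need: catalog[need] for need in request.needs}


# The application states its needs once; the platform answers per environment.
shop = WorkloadRequest(name="shop-api", needs=("database", "messaging"))
dev_bindings = resolve_bindings(shop, "dev")
prod_bindings = resolve_bindings(shop, "prod")
```

The same request resolves to different entry points in `dev` and `prod`, and the application never has to know what sits behind them.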
So, on to some lessons learned. The little things, I would call them. And you will maybe be surprised, but whatever you do, it's absolutely wrong. We have seen in the past that if you go one way and just say, okay, everyone can build containers, it's not a problem, you can just push them to the container registry, suddenly a huge group of people appears on the other side and says, hey, no, that doesn't work for me anymore. Or vice versa.

And then there's the market. The market is quite fast, actually. You sometimes have issues here because the market develops the tools faster than you can replace them. And it's not only about tools, but also about all the ideas and all the methodologies.

We need to simplify the infrastructure, but we should not have the goal of explaining to everyone what it actually looks like. You cannot train thousands of developers within an organization to understand every single part of the infrastructure. And this is not interesting for everyone. For everyone who is interested in this, great, you can do so, but it's not necessary to understand every little piece of the topic.

And as I said before, you need to start early with security. Every project where we started really late with security got harder and harder to bring close to production. It's still possible to do so, but there is suddenly this, well, pain in the room when you enter the security part. When you do it the other way around and apply security from the beginning, it's way easier.

Now, from a platform engineering perspective, there are a lot of complex things you have to keep in mind. This picture is actually a little bit older, but I still like to use it to explain to people what we all have to keep in mind. But yeah, take a look at it on your own. I mean, it's mixing topics, it's mixing responsibilities.
It's all somehow connected with each other and somehow it's not connected with each other. And therefore it is complex to develop platforms.

Let's look at three more interesting lessons learned which I took with me and which we try to explain to everyone we approach.

Kubernetes is not a hypervisor. Well, that's not something I have learned myself, but I have learned that most people who start utilizing Kubernetes misunderstand it. They treat it like a random virtual machine, like a hypervisor. I always try to explain it this way. When you bet on Kubernetes, you actually have two choices. Number one, you keep developing your applications like you have done in the past years, or maybe call them microservices. Or you go all in on Kubernetes. You rely on the product, you rely on the platform, and it takes over for you: encryption, DNS resolution, observability, tracing, traffic management, security, whatever else you want to have, user management, authentication, authorization. All these things can be taken away, but you have to bet on the platform. And this is what people do not understand. They think: I will build my software like I have done in the past, and I'm bringing it to Kubernetes, but I will not utilize it 100% because I do not want to have too many dependencies on the platform. And yes, that is a point; having too many dependencies is not good. But in this case, you need to see it as a whole ecosystem which comes together. And if it's done well, it's a very, very beneficial approach for yourself. Also, it gets harder if you try to keep building software like you have done in the past without really integrating with Kubernetes itself, because you need to invent workarounds all the time.

Cloud providers matter. On paper, all cloud providers are the same. They have more or less the same prices. They have more or less the same features and capabilities and so on.
We implement products and projects on various cloud providers, more than are listed here, and on hypervisors as well; it doesn't matter. And I can tell you one thing: every hypervisor and every cloud provider has its own problems, and you need to understand them. And you need to push for really choosing the best one from your development perspective, your infrastructure perspective, your engineering point of view, rather than maybe the best deal. Just because you can save a million euros on the environment, it doesn't mean that the platform is as good as something which is a bit more expensive. In my experience, quality sometimes costs more. And you can see this also in some very specific places.

Infrastructure as code is crucial for platform engineering. It is one of the most important things. Now, I love Crossplane. It's an awesome tool because it really fulfills all the modern needs of building platforms. But Crossplane, for example, has very poor support for Azure, because the Azure community is not contributing back very much, compared to AWS and GCP, where you have a great number of people behind it supporting it. Now, I know that the guys are developing here and trying to keep up with Azure. But Azure at the moment has a really big drive in the market, and we cannot adopt Crossplane, or the whole cloud native approach to provisioning infrastructure, there because the support is not so good. And there are a few other issues with the technology of Azure itself.

Lastly, there's a myth about CI/CD. Back, I don't know, five, six, seven years ago, CI/CD was the thing. Every company started implementing it, and it was really helpful. The automation was great for taking jobs away from ops teams, freeing them up to focus on the really important things, to keep developing the infrastructure and so on. But nowadays it also has some issues. We really need to break up CI/CD. It's not one single word, it's two parts.
It's the integration and the delivery. There are modern dev tools like Tilt, which you can use for cloud native development, reaching from your local machine or the remote machine where you develop up to the infrastructure where you are going to deploy, and which give you a fast feedback cycle about what's going on. That is so important for development, and you have to allow it.

On the other hand, as I said, I'm coming from a market where security is a very, very important point. So for a production lifecycle, we need to make a cut there and set a checkpoint, or sometimes even a break point. Now, in all the projects we have done, one of the best break points is the container registry: the container registry and the Git repository or the chart library, wherever you define how to deploy your container. There we can apply all the security features we need: scan for problematic things, create an SBOM, a software bill of materials, check containers for bad configuration, check for bad dependencies in the software.

And then, to still have the autonomy, the flexibility and the reliability to deploy, you need to move away from classic CI/CD pipeline thinking and move on to GitOps, because GitOps takes the certified, assured container, which has a high quality, together with the configuration and throws it at the clusters where it needs to be. But not only this. GitOps is needed not only because of this break point, but also because there will be changes in the future in the way we deploy containers, or whether you will deploy containers at all. Just think about WebAssembly. It's something which is approaching, and the community is driving hard to make it possible to utilize WebAssembly more and more often, because it answers other questions around security and isolation, for example.
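The checkpoint at the container registry can be sketched roughly like this in Python. This is a simplified illustration, not the API of any real scanner or GitOps tool: `ImageReport`, `blocking_findings` and `promote` are hypothetical names, and the report fields stand in for whatever your scanner and SBOM tooling actually produce. Only an artifact that passes the gate is written into the desired state, which a GitOps tool would then roll out to the clusters.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class ImageReport:
    """Hypothetical summary of the checks run against an artifact in the registry."""
    image: str
    has_sbom: bool
    critical_vulnerabilities: int
    misconfigurations: Tuple[str, ...] = ()


def blocking_findings(report: ImageReport) -> List[str]:
    """Collect every reason why this artifact must not reach production."""
    findings: List[str] = []
    if not report.has_sbom:
        findings.append("no SBOM attached")
    if report.critical_vulnerabilities > 0:
        findings.append(f"{report.critical_vulnerabilities} critical vulnerabilities")
    findings.extend(f"misconfiguration: {item}" for item in report.misconfigurations)
    return findings


def promote(report: ImageReport, app: str, desired_state: Dict[str, str]) -> Dict[str, str]:
    """Return a new desired state with the artifact pinned, or refuse to promote it.

    In a real setup the returned mapping would be committed to the Git
    repository that the GitOps tool watches. Here it is just a dict.
    """
    findings = blocking_findings(report)
    if findings:
        raise ValueError(f"{report.image} rejected: {'; '.join(findings)}")
    updated = dict(desired_state)
    updated[app] = report.image
    return updated


# Made-up registry and versions, for illustration only.
desired = {"shop-api": "registry.example.internal/shop-api:1.4.1"}
good = ImageReport("registry.example.internal/shop-api:1.4.2",
                   has_sbom=True, critical_vulnerabilities=0)
bad = ImageReport("registry.example.internal/shop-api:1.4.3",
                  has_sbom=False, critical_vulnerabilities=2,
                  misconfigurations=("runs as root",))

desired = promote(good, "shop-api", desired)  # passes the gate
problems = blocking_findings(bad)             # would be rejected by promote()
```

Note that neither the gate nor the desired state cares what kind of artifact the reference points to. It is just a string, so the part that actually deploys it can change later without touching the checkpoint.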
So if you already set up a pipeline now which is based on GitOps, in the future you can very easily replace the part that deploys an OCI-compliant Docker container with any other kind of container format. That's why it is very helpful to break CI/CD up.

For sure you need unified observability, implemented on every kind of infrastructure. The thing is that for every cloud provider you need to adjust it a little bit. You need to fine-tune a little bit because everyone does it a little bit differently. And you have to do it. On average, I think according to the cloud native report from last year, an enterprise has at least 2.4 public cloud providers. In other words, one company has two, the other company has three cloud providers, and this is a huge amount. So you need to find a way of doing observability which is unified for any kind of platform. This is needed to support development teams and to give developers the full insight: traceability, the logs, the events, how the application performs. Think about progressive delivery, metric-based shifting of workload from A to B. You need to give the insights back.

I think I have talked enough about security. Sorry for annoying you with that. Maybe I should change the name of the talk a little bit, but it is the same message: start early with it.

So, a few things which also do not work so well when you're building platforms. Number one is that observability is not designed for platforms nowadays. If I want to operate and manage and observe the infrastructure, I basically always also observe the containers on top of it. But what I actually want is that the software which runs on top of it, in its isolated environment, has its own stack for observing, for metrics, for logs. And this should be shipped somewhere else, not into the infrastructure itself. And it's very cost intensive.
And it doesn't matter with which cloud provider, or whether you use native tools or an additional tool: they're all very, very costly.

Multi-tenancy. Actually it's a topic which is slowly getting sorted out, step by step. But remember, containers are not secure, so the isolation isn't either. On the other hand, yes, user management is somehow sorted out, but it's complex and hard. In the projects I've seen, user management is one of the most complex parts of all of this. And there's still no very good solution which can easily sort this out.

And the metadata. We built a, let's say, Kubernetes distribution for a telco provider, where we had so much metadata which needed to be stored somewhere. There were many ideas about storing it in Kubernetes, but hey, Kubernetes is not a database itself. It should do other things. You can store it in Git, but sometimes Git gets too slow if you have too many requests. It doesn't matter whether it's GitLab or GitHub. And then you come up with your own solution, but this is also not perfect.

So these three things are not very well done yet, in my opinion. But on the other hand, I also think this platform topic is still quite new and young on the market.

So to finish this talk, there are a few key metrics which we keep in mind when we build platforms. The first one is that the number of scripts you need to provision and maintain your infrastructure shows how inflexible you are. The more scripts you have, the more dependencies you create between tools. And that's not good. And this leads directly to point two. The more components with interdependencies you have, the less you can use your platform in the future. As I said, the market develops too fast. You will have to replace tools in the future. If you do not build your platform so that some of the parts can be changed, then you have to rebuild the platform again. And this is the major issue.
Most enterprises need to do this over and over and over again: rebuilding platforms.

And last but not least, my favorite one, which actually comes from one of my product owners, who got it back as feedback in a project. If the handbook for the developers is too complex, it means that the platform is worse and that it's actually less likely to be used. Developer focus ends where your documentation starts. And if the documentation is too complex, well, you have a problem.

Thank you very much. I hope you enjoyed it, and have a great day.