All right. Hello and welcome to my session on what actually makes software cloud native. It's a pretty foundational topic, and a lot of people have opinions on it, but today I wanted to drill into my opinion and what has formed it over my time in the space. So who am I? I am Jimmy Zelinskie. I am the co-founder of a company called AuthZed. We produce an open source project called SpiceDB, which is a cloud native authorization database. Previously, I worked at Red Hat as a product manager on OpenShift and SRE. And I got started in the cloud native ecosystem literally at the beginning, at a company called CoreOS, which was ultimately acquired by Red Hat. At CoreOS, I both started and contributed to a bunch of foundational cloud native projects, and helped pioneer the space, and later, in a product role, the vision for the space. I've also been a contributor to the OCI standard, which is the standard for container images. So a lot of people will try to tell you what cloud native is. If you look up this term, you're going to see that every single cloud provider wants to define this movement and the tenets behind it. And this is no conspiracy: they want folks to understand cloud nativeness, obviously, to buy into their products. But I think at the end of the day, it's really so that customers have success with their products as well. The more successful you are with the cloud, the more of it you consume. So some of these definitions are biased, and they all kind of push you towards particular solutions. No talk would be complete without a meme, so here's the "is this cloud native?" meme. If you take some legacy software and wrap it in a container, is this cloud native?
I think a lot of people would have a knee-jerk reaction and say no, but I hope that by the end of this talk you realize it's actually a far more subtle discussion whether you could consider this cloud native or not. Because you certainly get some value out of packaging something as a container, even if it wasn't designed for it. I think the best place to start is what doesn't make software cloud native. A lot of the time, when you look at the cloud providers' definitions, they talk a lot about the ecosystem and the movement around the space. Fundamentally, I don't think any particular technology makes you cloud native. A lot of these definitions say containers make you cloud native, or microservices, or serverless, or DevOps practices. I don't think any of that actually makes you cloud native. These are design patterns that might benefit from the cloud, or that you may be more inclined to use because of the cloud, but that isn't fundamentally what makes or doesn't make something cloud native. Because ultimately, what I believe is that it's not how it's built, but what it's taking advantage of. As with the previous example, you can always force a square peg into a round hole, package up some very old legacy software into a container, and try to claim it's cloud native. There's a quote by Joe Beda, who is one of the creators of Kubernetes. Shortly after his company Heptio was acquired by VMware, VMware did an interview with him where they tried to pin down his vision for the cloud native space. And I think he summed it up fairly well, better than most of the other resources you'll find out there.
He describes cloud native as somewhat of a question, one that forces us to alter our processes, our tooling, and fundamentally how we do things at an organizational level, to take advantage of the unique aspects of the cloud, the new capabilities the cloud actually buys us. But I think a lot of folks look at software vendors and ask: if I adopt this technology, will it make me cloud native? And that is fundamentally a role reversal. It is you, the end user trying to consume the cloud, who tells the vendors what is cloud native, because what fundamentally makes you cloud native is whether you are taking advantage of those unique aspects of the cloud. So if a vendor tells you "using us makes you cloud native," it actually might not, depending on what value you're trying to extract from the cloud; not all quote-unquote cloud native vendors might be targeting those values. And you could end up fundamentally misaligned with the software stack you've selected, despite picking all quote-unquote cloud native solutions. So I think of this in three different ways when evaluating whether I believe something is cloud native. And at the end of the day, there is no line in the sand that makes you cloud native or not; I think it's a spectrum, and a fuzzy one. That's why I have the "if it looks like a duck and it quacks like a duck, then it's a duck" quote here, because that's how I feel about cloud native. I like to evaluate things on three terms. First: is it easy to deploy to a cloud platform? Can I get started? What does it take to actually get this thing running on the cloud? If it's a Herculean effort where I have to write a bunch of code and really migrate things, that thing is not going to be cloud native. But then there's running it. Does it run well?
Is it taking advantage of the things the cloud is providing you operationally? And then the second-order aspect of all of this: are you taking advantage of the ecosystem, all of the other tools that people are using to be successful in cloud environments? I'm going to dive deeper into each of those three. When it comes to deploying, the number one assumption folks have is that it's packaged the way they expect. But cloud environments are actually fairly heterogeneous. They're pretty different. Some folks are using orchestrators like Kubernetes, but there are other platforms that are very popular. ECS, for example, is a proprietary one on Amazon, the Elastic Container Service. And then there are even other frameworks like Juju from Canonical. All of these fundamentally work with different types of packages. So you might be asking yourself: hey, I know cloud native things, I should basically provide my application as a container. But for something like Juju, you actually might want snaps available. And 50% of new VMs spun up on the cloud are still running things like Ubuntu; in that scenario, you probably want to ship Debian packages. Or, classically, if you're targeting Red Hat Enterprise Linux, or even Amazon Linux, you're probably going to want to package RPMs. And then, going back to an even more popular platform like Kubernetes: if you're just providing a container, is that enough? No, usually people expect Kubernetes manifests of some sort, or something they can customize even more, like a Helm chart. So there's a lot to consider here for packaging, and it really depends which ecosystem you're trying to invest in, which one makes the most sense for your software. Putting on my hat as a vendor, we have tried to target the most popular things.
I think we've actually packaged for every example I just mentioned, just to make sure we cover all our bases. But we definitely didn't do all of them at the beginning; we waited to see what folks were most interested in before we built those. So once you're packaged, folks can understand how to install it. That's great. But the next step is: what is it like to actually configure that thing to run in a real environment? If we think back to the beginning of the cloud era, Heroku wrote this manifesto called the Twelve-Factor App. In it, one of the major influences on the industry was that configuration should be read from environment variables. This is fairly standard practice these days. If you're deploying on something like Kubernetes, or even more proprietary platforms like Vercel, you're still going to configure environment variables that are injected at runtime and then configure your application. This has gotten super popular. But not all applications do that; some of them still read config files. And then, even more interestingly, the environment is more than just environment variables. For example, there are some applications that will introspect their surroundings; they'll detect that they're running on Kubernetes. SpiceDB, the product that we build, will actually detect that it's being run in a Deployment on Kubernetes, then discover other replicas in the Deployment, start connecting to them, and self-cluster automatically. There's a lot of powerful stuff you can do if you write software that is cloud-aware, aware even of the control-plane-level APIs, to build integrations and detect the correct way to configure things, so that users don't have to configure them and deployments are more robust to environments changing as well.
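As a concrete sketch of that twelve-factor style of configuration, here's roughly what injecting settings as environment variables looks like in a Kubernetes Deployment. The image, variable names, and Secret are hypothetical placeholders, not any real product's configuration:

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: example-app
spec:
  replicas: 2
  selector:
    matchLabels:
      app: example-app
  template:
    metadata:
      labels:
        app: example-app
    spec:
      containers:
        - name: example-app
          image: registry.example.com/example-app:1.0.0  # hypothetical image
          env:
            # Plain value injected at runtime, per the twelve-factor approach
            - name: LOG_LEVEL
              value: "info"
            # Sensitive values pulled from a Secret rather than baked into the image
            - name: DATABASE_URL
              valueFrom:
                secretKeyRef:
                  name: example-app-secrets
                  key: database-url
```

The point is that the same image runs unchanged in every environment; only the injected environment differs.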
And finally, along those lines of configuration, there's what it actually takes to deploy and upgrade the thing. For example, there's a concept I actually helped pioneer at CoreOS back in the day called operators. A Kubernetes operator is basically software that knows how to speak to the Kubernetes control plane and reconcile custom logic about an application running on Kubernetes with the state of the world in Kubernetes. It bridges the gap between "what is the state of my application, and what does it need?" and teaching Kubernetes itself how to manipulate that application to be successful. Lots of databases, for example, have operators that help them do zero-downtime upgrades, or just simplify deploying software that has a lot of moving parts and managing those moving parts. The bar I set for a good user experience is what it takes to upgrade. If I can upgrade the software by changing a single line of declarative configuration, just bumping the version number, or basically by running a single command, then I would say that is a truly good, native experience for my cloud environment. So, moving on to how well something runs in the cloud. I usually look at three pillars for that. A lot of the initial sell for folks moving to the cloud was this concept of elasticity: the idea that you can scale up and scale down based on demand for your applications. This is fundamentally super critical; if your software is unable to auto-scale, or to scale basically by adding more processes, then you're really missing out on a lot of the value of the cloud. The other aspect to this is: what kind of instances does the software run well on?
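To make the one-line upgrade idea from the operator discussion concrete: with an operator-managed database, upgrading often amounts to editing a single field in a custom resource. This is a hypothetical custom resource and API group, not any specific operator's schema:

```yaml
apiVersion: example.com/v1   # hypothetical API group
kind: ExampleDatabase
metadata:
  name: prod-db
spec:
  version: "2.4.1"   # bump this one line; the operator reconciles the rollout
  replicas: 3
```

You apply the change declaratively, and the operator handles the ordering, draining, and verification that a human would otherwise script by hand.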
If you are developing something that can be killed very easily, is resilient, and isn't going to be corrupted or lose data if it's destroyed at any point in its lifecycle, then you can start running your software on cheaper instances in the cloud. There are instances called spot instances, which can be terminated at any time. You basically bid for them, and they typically maintain a very low price point because the cloud provider is churning through them quickly; you only get them for so long before they come back up for auction. So if you develop software in this way, not only is your software going to be more resilient to the kinds of networking failures that happen in clouds (which, by the way, happen very frequently; clouds are not super reliable in this sense), you also get to take advantage of cost-saving measures like spot instances. Finally, there are the fundamental architectures of the hardware available from the public cloud providers. AMD64 and ARM servers are the standard if we're talking about CPU architecture. I have seen people take pre-existing mainframe software, package it up into a virtual machine, package that into a container, and then run it in a cloud environment. This is super inefficient, and they're not getting the performance they would get running that software directly on a mainframe. So they're losing out in that sense. Maybe the juice is worth the squeeze, because then they can abandon all of their mainframe hardware and the overhead and maintenance of that. But I do think this still counts as not really taking full advantage of the cloud.
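Coming back to spot instances for a moment: on Kubernetes, steering a disruption-tolerant workload onto spot capacity usually comes down to a node selector and a toleration in the pod spec. The label and taint keys below are illustrative; real keys differ per provider (EKS, GKE, Karpenter, and so on):

```yaml
# Pod spec fragment for a workload that tolerates spot-node interruption.
# node.example.io/capacity-type is a hypothetical key; substitute your
# provider's actual spot/preemptible label and taint.
spec:
  nodeSelector:
    node.example.io/capacity-type: spot
  tolerations:
    - key: node.example.io/capacity-type
      operator: Equal
      value: spot
      effect: NoSchedule
```

The toleration lets the pod land on tainted spot nodes, and the node selector keeps it off the more expensive on-demand pool.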
If you're really taking advantage of it, you might actually have to rewrite your software so that it isn't designed for something like a mainframe, but is designed to run on x86 Linux or ARM Linux. Cool. Finally, there is the ecosystem, and really the thriving ecosystem around cloud native software. I think the most important question here is: do you fit into a cloud native architecture diagram? If we're looking at all the boxes and arrows pointing to the different places where data is flowing and where the API interactions are, is there a box for this software? Can you slot it in somewhere that makes sense? Then, when it comes to playing well with the ecosystem, I've alluded to this earlier, but the style of API, whether it's REST or gRPC, is super important. Folks expect one of those two types of API, but they also expect a lot of these APIs to be declarative, so that they can build abstractions on top of them, like Terraform or other infrastructure-as-code tooling, and reproduce environments, spinning them up and down by simply running a program. That's super powerful for a lot of cloud native dev environments, because it extends the operational capabilities not just to your production environment but to your developers as well. Finally, I wanted to highlight observability data: making sure you play well with the way people have operationalized the cloud, how they inspect things and make sure they're actually getting value out of the cloud, using all the tooling in the observability space. If your software is not producing things like JSON-formatted structured logs, OpenTelemetry tracing, and Prometheus time-series metrics, and you don't have things like Grafana dashboards that people can reuse, that's a problem.
It's going to impede the success and the value people get out of your software, because they're either going to have to rebuild all those things, probably worse than if the community were all sharing and contributing to one of them, or they're going to pick up another solution in the space that is actually inspectable, so they can better understand the performance characteristics of the software, make sure they're getting the value, and make sure that value aligns with them. So I wanted to do a case study next and look at what a really mature project in the ecosystem looks like. Prometheus has been around since before the CNCF was founded. It was a project originally started at SoundCloud by ex-Google engineers who wanted a metrics system similar to the one inside Google, so they could get better metrics at SoundCloud. They eventually open sourced it, and it has become a community project. It has now graduated in the CNCF and is basically a staple of that observability category I just mentioned. So let's see how it stacks up against our definitions so far. Is it packaged for the ecosystem? You bet it is. You can find it everywhere. You can apt-get install it on any Debian-based Linux. It also has one of the most popular container images across the registries. It is just everywhere. There are Helm charts for it, Kubernetes manifests, you name it. So many people are deploying this that there are even third-party vendors outside the Prometheus open source team maintaining LTS versions, to make sure the version they're using lasts a very long time. Can it discover things from its environment? Not only does Prometheus accept environment variables as config (it also uses a config file), it includes Kubernetes discovery, similar to what I described SpiceDB doing.
Prometheus has a service discovery mechanism where it discovers software that you're running on Kubernetes and collects metrics from software that has been specifically annotated in the Kubernetes control plane. So it has a deep enough integration with its environment that it can discover the targets it's supposed to collect metrics from all on its own, without any configuration. It has a Kubernetes operator, probably one of the most popular ones on the planet, called the Prometheus Operator. This is how I recommend folks install and deploy Prometheus. It helps you manage all the various components of Prometheus and build a production-ready deployment in very few lines of configuration. You can also update the version of Prometheus you're using by simply changing a single line in that configuration. And not only is it available for you to run that way, there are even first-party cloud services: you can purchase a hosted, managed version of Prometheus run by Amazon if you're on AWS, and Google has a similar offering where you can ingest Prometheus metrics and get them scraped into the native Google Cloud monitoring dashboards as well. So that one gets all the marks. And here is where Prometheus doesn't shine: Prometheus doesn't scale horizontally. You can shard it, but fundamentally it has to be vertically scaled. So it really doesn't get the advantage of the elasticity aspect of the cloud, unfortunately. This is fundamentally a design decision in Prometheus. There might be another way you could structure a time-series database that would benefit more from elasticity, but this is how it is today. Can it be run on spot instances? This is debatable. When you deploy Prometheus in a robust way on Kubernetes, you actually run two of them, in a StatefulSet with two replicas, so that if one goes down, the other is still there, available and scraping.
So in that sense, you can take one down and not lose any data, but you're achieving that by just duplicating the work, which isn't ideal. And you definitely can lose data if both of your spot instances disappear at the same time, which is actually quite common: because you're spinning up these machines at the same time, you're probably getting them on the same lease, so those instances will disappear at the same time too, and you'll lose both of your replicas and your metrics data. You can work around this, but it just isn't compatible with spot instances natively, out of the box. And finally, does Prometheus run well on cloud architectures? You bet it does. It was written in Go, primarily designed around AMD64. It has lots of optimizations in the code for these platforms, and it uses libraries under the hood that rely on assembly for these architectures, making sure cryptography and the like are as fast as they possibly can be on typical cloud hardware. So does Prometheus take advantage of the ecosystem? Well, it certainly has a box in the architecture diagram; it's probably the most well-known component in the observability stack. Is it exposing declarative APIs? Yes: with the Kubernetes operator you get custom resources in Kubernetes that let you declaratively define what you want to deploy in terms of Prometheus. It also has a spec around its API that's implemented by many, many other tools in the ecosystem. So it definitely is an API-first design, and anything you can do in Prometheus, you can do via an API. Finally, does it expose observability data? Well, as an observability component, you bet it does. It actually scrapes itself, so it stores its own metrics in itself as it runs.
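Both of those behaviors, the self-scraping and the Kubernetes-based discovery mentioned earlier, show up as just a few lines of prometheus.yml. The `prometheus.io/scrape` annotation here is a common community convention rather than a formal standard:

```yaml
scrape_configs:
  # Prometheus scraping its own metrics endpoint
  - job_name: prometheus
    static_configs:
      - targets: ["localhost:9090"]

  # Discover pods via the Kubernetes API, keeping only those
  # annotated with prometheus.io/scrape: "true"
  - job_name: kubernetes-pods
    kubernetes_sd_configs:
      - role: pod
    relabel_configs:
      - source_labels: [__meta_kubernetes_pod_annotation_prometheus_io_scrape]
        action: keep
        regex: "true"
```

In practice the Prometheus Operator generates configuration like this for you from ServiceMonitor and PodMonitor resources, so you rarely write it by hand.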
So you actually don't have to do any configuration at all, because it's already collecting its own metrics. In summary, I wanted to highlight Prometheus specifically to show that you can have a mature cloud native product that has been in the ecosystem from the beginning, a graduated project in the CNCF, and it still doesn't meet all the criteria if we were trying to draw a hard line in the sand; there are ways it isn't taking full advantage of the cloud. I think the takeaway should be that not everything is going to meet some definition of perfect; you need to adopt the things that make the most sense and align with the value principles of your cloud adoption journey. So with that, go forth and reap the benefits of the cloud. If you're interested in what cloud native authorization looks like, you can join our Discord at discord.gg/spicedb or check out our GitHub project at github.com/authzed/spicedb. Thanks.