Mic sound OK? Pretty sensitive. I'll try not to yell, but is everything streaming well? Excellent. All right, let's dive in. I'm really excited to talk today about OpenCost, a brand new open source project for cloud native cost monitoring. My goal today is to first introduce the project, second, give a live demo in a Kubernetes environment, and third, talk about some of the lessons we learned and the methodology applied to the OpenCost project. First, a bit about me: I'm the co-founder and CEO of Kubecost, where we build cost management solutions specifically tailored for Kubernetes and containers. I'm an engineer by training and a former Google product manager, where I worked on infrastructure monitoring. If we start by thinking about the broader Kubernetes ecosystem, it is mind-blowing to me that five-plus million developers are now engaging with the Kubernetes platform. It feels like every day I hear of new and interesting deployments and applications of Kubernetes. Just this week, I've heard of Kubernetes in space, Kubernetes in automotive, and a bunch of other interesting applications. It increasingly feels a lot like the Linux project itself, in that it's being modified and molded for many different applications. And we think this is for very real reasons: tons of flexibility, control, portability, scalability, et cetera. So that is super positive. On the other side, it is really changing a lot of the dynamics around cost monitoring and cost management, and it can be challenging when you're new to Kubernetes to operate it efficiently. This is really for two reasons. The first is technical: you have this new layer of abstraction, the scheduler, so it is now harder to intuit why and where applications are placed. Second, applications and resources in your Kubernetes cluster tend to be more dynamic, with pods and jobs coming up and down, nodes coming up and down, et cetera.
And then lastly, and probably most important, there are far more likely to be shared resources. The old, pre-Kubernetes way of doing cost monitoring and cost management was to allocate a VM, tag it for a team or an application, and charge that back or allocate it to that particular entity. That largely breaks down in Kubernetes, because resources are generally shared, and you need to go deeper to think about allocating them fairly. And then secondly, we generally see behavioral or organizational changes. Again, these are super positive: teams tend to ship more rapidly in a more decentralized way. But from a cost and optimization standpoint, it can be more difficult to manage, govern, or even understand why costs change radically. The sum of these complexities is that most teams are not able to answer really basic questions about costs in their infrastructure. In fact, a really interesting CNCF study last year found that approximately 70% of companies do not have accurate visibility into costs around their containers and Kubernetes footprint. It is for that reason that we are super proud to introduce the brand new OpenCost effort. It is a new open source project for cost monitoring and cost allocation in Kubernetes environments. It is vendor agnostic, Apache 2.0 licensed, and all of two weeks new. The project itself has two pieces. First, a brand new community-built spec, which covers the how and the why of doing accurate Kubernetes cost monitoring. And second, the Kubecost team has contributed our core cost allocation engine, which is a working implementation of that spec for AWS, Azure, GCP, and on-prem clusters, and meets all the requirements of the spec as of today.
Talking a little bit about the backstory and how we got here: we're really proud to be one part of this amazing group of contributors that built the spec and the design of the OpenCost project. It was really amazing to see this mix of cloud providers, Kubernetes experts, and end users who have been running Kubernetes at massive scale for years now. It was also a really interesting mix of backgrounds: engineering, like ourselves, but also finance or FinOps groups from some of these different companies. What I think was most interesting is that this group largely came together organically. We really saw it as a function of the fact that there's still tons of ambiguity in this space, where even cloud providers had different end users asking how to think about cost. A couple of these companies actually implemented their own cost monitoring and arrived at largely different answers on what a namespace or pod, et cetera, costs. So it was really cool to see them come together and start to build a common language for thinking about costs in cloud-native or Kubernetes environments. One thing that I am super proud to announce just happened last week: this group came together with the intention of finding a neutral home for the project, and the project was just accepted to the CNCF. It's going through CNCF onboarding literally as we speak, so we're hoping to announce the completion of that onboarding soon, expected within the next month or so. All right, let's get into some of the core tenets, the foundational principles, of OpenCost. It's all anchored around the fact that this was built for enabling action: enabling engineering teams or even FinOps teams to get in there and actually manage or optimize infrastructure efficiently. The first core tenet for doing this was having cost visibility in real time.
Again, this is a major departure from pre-Kubernetes, where you generally would wait on a cloud provider bill; it may be six, it may be 24 hours later. By default, OpenCost can give you costs one minute after, say, a new pod or job has been spun up in the environment, and it can be configured to give you costs even faster than that. That is really important for actually managing efficient infrastructure: waiting to optimize infrastructure a day later, or six or 12 hours later, is far less impactful than doing it in real time. You can also do a bunch of really cool things, like auto-scaling on cost, or making orchestration, infrastructure, or application-level changes based on cost. The second core tenet was the ability to look at costs by truly any dimension. This is really important because organizations architect and organize their applications or microservices differently in Kubernetes, but also because the ability to take action, or to have teams bought into these cost figures, is truly a function of being able to understand how they're generated. When you tell an engineering team the cost of a namespace, it's really common for them to ask, well, why does it cost that? Having the ability to drill down into services and then namespaces, or labels and then namespaces, or individual pods, is really powerful. And thirdly, because OpenCost was purpose-built for Kubernetes, it has native support for anything that's going to be shared in that environment. That includes underlying shared nodes, disks, et cetera, but also workloads in the environment that are shared by other tenants. Say you have a monitoring namespace or a logging namespace: OpenCost supports the ability to allocate those shared resources in different capacities.
I mentioned this pre-Kubernetes way of using cloud tags, which from my perspective was always a reactive and manual exercise. With OpenCost, you just get the natural organization of your Kubernetes workloads, whether that's by namespace, by microservice, or by controller. This generally fits really well into natural development workflows, so there's no major retroactive exercise of coming back and reviewing tags like there was before. I mentioned how important it was to have support for the major three clouds as well as on-prem out of the box, which we now do with the OpenCost implementation. And the last piece of this, which is part of why we open-sourced it, is that the project already has a bunch of deep integrations with things like Prometheus, Alertmanager, OpenTelemetry, Cortex, and other PromQL time series databases. We see a real opportunity to enable a bunch of really cool integrations, because this is open source, and ultimately to bring this data and these insights to where developers are spending their time. Already you can see this data in kubectl, Grafana dashboards, and other places that may feel natural for your day-to-day workflow. With that, I'll dive into a quick demo and show you some of this in action. This is a little demo environment I have set up with real Kubernetes workloads running. First, we're looking at cost by namespace. I talked about how we really haven't seen a breakdown of cost that we don't support today: any of these dimensions, from all the way down to container or pod, all the way up to the cluster level, and everything in between. You can also do multi-dimension aggregations with the OpenCost APIs.
From here, I can see cost over time; I'm looking at it on a daily basis. If I look at just cost today, I can see cost aggregated in this UI on an hourly basis. If I go down to the underlying APIs, I can see cost in near real time or real time, depending on how you've configured OpenCost. From here, I can see the cost of every single resource consumed by each namespace in this cluster. I can then, with one click, drill into any particular namespace, and here I see every single controller running in this environment: all of the DaemonSets, Deployments, Jobs, StatefulSets, et cetera. Again, this is really critical for understanding why in the world this namespace costs what it does. From here, I can keep going down to the individual pod level, and then further, to look at truly each individual container running in this namespace, all the resources they're consuming, and the cost of each resource. In our experience working with thousands of teams running Kubernetes, this is super powerful for getting to the confidence level, or even the ability, to take action to actually impact, say, the cost of this namespace. So that's one part of the demo: diving into the allocation of any cost in your environment. If we flip over to the other side, it's about the underlying assets or resources available in your Kubernetes cluster. Here the OpenCost pod is just sitting in a Kubernetes cluster, introspecting anything that becomes available. Right away, it starts emitting metrics for that particular workload based on whether it's in an EKS cluster, a GKE cluster, or an on-prem cluster with custom pricing sheets. Here we can see this cluster has four different nodes that have been running over the last seven days.
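To make the API drill-down from the demo concrete, here's a minimal Python sketch of querying allocation data over HTTP. The port, endpoint path, and parameter names here are assumptions for illustration; check the OpenCost API documentation for the exact interface exposed by your deployed version.

```python
# Sketch: querying cost allocation aggregated by an arbitrary dimension
# (namespace, controller, pod, container, label, ...), as shown in the demo.
# NOTE: the base URL, /allocation path, and parameter names are assumptions.
import json
import urllib.parse
import urllib.request

def build_allocation_url(base, window="1d", aggregate="namespace"):
    """Build a query URL for allocation data over a time window,
    aggregated by the given dimension."""
    params = urllib.parse.urlencode({"window": window, "aggregate": aggregate})
    return f"{base}/allocation?{params}"

url = build_allocation_url("http://localhost:9003", window="1h", aggregate="pod")

# Uncomment to query a live service (e.g. after `kubectl port-forward`):
# with urllib.request.urlopen(url) as resp:
#     allocations = json.load(resp)
```

Narrowing the window gives the near-real-time view mentioned above, and changing `aggregate` is the one-click drill-down from namespace to controller to pod to container.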
I can drill down to the individual node level and see resource-level detail about what's happening with this particular asset. I think this highlights the two different parts of the OpenCost model. One is the individual assets and resources that are in existence, or observable, in your environment. And two is the allocation of those to any aggregation or dimension of Kubernetes workload that's available. All right, cool. So that is a quick demo of the OpenCost data. I'm now going to turn to some of the lessons and methodology. I will say there is another talk later this week on OpenCost, which goes really deep into applications and how this data can actually be applied to make really critical decisions in your environment; I'd encourage you to check that out. All right, with all this complexity in measuring costs, there are actually just two equations you really need to solve to either implement the OpenCost spec or, in our view, do cost monitoring really effectively in these environments. It all starts with the total cluster cost for your Kubernetes environment. We saw that at play earlier: that's everything that's observable, from compute to network to storage, et cetera. This can be broken down into individual asset costs, or direct costs, as well as overhead costs. From a finance perspective, you can think of asset costs as cost of goods sold, or variable costs, whereas overhead costs are more fixed on a per-cluster level. Asset costs can be further broken down, here on the right, into allocation-based costs and usage-based costs. Allocation-based costs are those that are provisioned or reserved based on capacity; whether you're actually using them is less relevant, because you're getting billed for them one way or the other. Usage-based costs, on the other hand, are pay-for-what-you-use.
For a practical set of examples, looking at this chart: some examples of allocation-based costs are nodes with CPU, RAM, and GPU resources attached, disks, whether attached disks or persistent volumes, load balancers, et cetera. The most common usage-based cost in a Kubernetes environment is network egress: if you're not egressing data across the network, you're not actually being billed. And the most common example of an overhead cost is a cloud provider's cluster management fee, but this could also be something like internal DevOps team time allocated to a cluster, when you think about showback or chargeback for that particular cluster. All right. The other really important equation to answer is: what is the cost of each container running in the environment? We just talked at the cluster level about the macro cost of everything running in your ecosystem; this goes all the way down to each individual container. Here, the OpenCost spec, and what we strongly recommend for thinking about Kubernetes cost monitoring, takes into account the Kubernetes request as well as usage. Let's walk through a couple of examples. If a pod is best effort and does not have a Kubernetes request applied, meaning it doesn't have resources allocated for it by the scheduler, costs are based on usage only. You pay for what you use, and if you don't use anything, you don't incur any costs. This is great from an efficiency standpoint, in the sense that there are no idle costs associated with this workload. But it comes with the trade-off that this best-effort pod is generally going to be the first to be CPU-throttled.
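To make the first equation's arithmetic concrete, here's a minimal Python sketch of the cluster-cost breakdown just described. All of the dollar figures are made up for illustration; only the structure (allocation-based plus usage-based asset costs, plus overhead) comes from the talk.

```python
# Sketch of the first equation: total cluster cost = asset costs + overhead,
# where asset costs split into allocation-based and usage-based pieces.
# All figures below are invented for illustration.

allocation_based = {            # billed on provisioned capacity, used or not
    "nodes": 640.00,            # CPU / RAM / GPU attached
    "persistent_volumes": 85.00,
    "load_balancers": 36.00,
}
usage_based = {                 # pay only for what you consume
    "network_egress": 42.50,
}
overhead = {                    # fixed per-cluster costs
    "cluster_management_fee": 72.00,
}

asset_cost = sum(allocation_based.values()) + sum(usage_based.values())
total_cluster_cost = asset_cost + sum(overhead.values())
print(f"asset: ${asset_cost:.2f}, total: ${total_cluster_cost:.2f}")
```

The overhead bucket could just as well hold an internal DevOps time allocation instead of a provider fee, as mentioned above.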
It's generally going to be the first to be OOM-evicted if there is a shortage of memory available. So you can think of it as really good from a cost efficiency standpoint, but there's a real trade-off in terms of quality of service or reliability for this particular workload when taking this path of not setting a request. Second, the middle case is a container with a request set, where usage is actually less than that request. Here, this container would be billed at the amount of its request for the different resources. It would have some amount of idle cost, so not perfect cost efficiency, because there would be some waste or idle, but it would have some expectations in terms of quality of service, given that the scheduler has allocated these resources for this particular pod or container. And lastly, there's a third example, which is the same as the second but with a limit applied. What we're saying here is that the limit is actually not relevant in the OpenCost spec. The analogy we talked about was borrowing money from your parents. What you actually borrow and spend is like usage: you need to be billed for that, or repay it. If you request that they set aside some amount of money, it's only fair to say they can't spend that money elsewhere, so you should be responsible for that in some capacity. But your limit, which you may never reach, is largely not relevant from our perspective in terms of what you actually spend or consume. One other example to point out: in these burstable cases, there is a scenario where your usage is far above your request. There, you would actually be billed on your usage. Again, that is totally possible if there are resources available in that environment.
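The three cases above all reduce to one rule: bill each resource on the greater of its request and its usage, and ignore the limit. Here's a small Python sketch of that rule; the per-hour prices are made-up numbers, not real cloud rates.

```python
# Sketch of the per-container equation: each resource is billed on
# max(request, usage); the limit never enters the formula.
# Prices are hypothetical per-hour rates for illustration.

CPU_PRICE = 0.031   # $ per vCPU-hour (made up)
RAM_PRICE = 0.004   # $ per GiB-hour (made up)

def container_cost_per_hour(cpu_request, cpu_usage, ram_request, ram_usage):
    billed_cpu = max(cpu_request, cpu_usage)
    billed_ram = max(ram_request, ram_usage)
    return billed_cpu * CPU_PRICE + billed_ram * RAM_PRICE

# Best-effort pod (no request): billed on usage only, so no idle cost.
best_effort = container_cost_per_hour(0, 0.2, 0, 1.0)

# Request above usage: billed on the request, so some idle cost.
over_requested = container_cost_per_hour(2.0, 0.2, 4.0, 1.0)

# Bursting above the request: billed on usage; any limit is irrelevant.
bursting = container_cost_per_hour(1.0, 3.0, 2.0, 2.0)
```

In the parents analogy: what they set aside for you (the request) and what you actually spend (usage) both matter, and you owe whichever is larger; the ceiling you never hit does not.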
Now that we've answered both of those equations, the cost of your total environment as well as the cost of an individual container, we can put it all together and think about the cost of idle, or efficiency. On the left here is the breakdown of cluster costs across the dimensions we talked about: allocation-based costs and usage-based costs. In the middle, the allocation side can be split between idle and allocated. All of the allocated and usage costs are deemed workload costs, and the remainder is cluster idle. This is actually the number one starting point when we think about applying OpenCost data to an environment to optimize really efficiently. We start by looking at this idle cost; it is truly the one number I recommend starting with. And it can be eye-opening: it is very common for us to start with users and find they're 80% idle, or sometimes even 90% idle, in a Kubernetes environment. So if you start with one number when trying to be efficient, we definitely recommend that one. All right, cool. Let's talk about some of the lessons we've learned. These are lessons from the past four or five months building the OpenCost project with this community of contributors, but also lessons from years of working with thousands of companies optimizing their Kubernetes environments. The first is that real-time data actually matters. It may be less important for roll-up monthly or quarterly reporting, but it's really important when you think about how you can actually apply the insights available from this data. Again, you can now do really cool things, like auto-scaling based on costs, that largely weren't possible before this data was available.
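Putting the two equations together, cluster idle falls out as simple subtraction. This Python sketch uses invented figures to show the split between idle, workload, and total cost described above.

```python
# Sketch of the idle calculation: cluster idle is the allocation-based
# cost that no workload has claimed via requests or usage; efficiency
# follows from the workload share of total cost. Figures are made up.

allocation_based_cost = 761.00   # nodes, disks, load balancers
usage_based_cost = 42.50         # e.g. network egress
allocated_to_workloads = 152.20  # allocation-based cost claimed by containers

idle_cost = allocation_based_cost - allocated_to_workloads
workload_cost = allocated_to_workloads + usage_based_cost
total_cost = allocation_based_cost + usage_based_cost

idle_fraction = idle_cost / total_cost   # the one number to start with
print(f"idle: ${idle_cost:.2f} ({idle_fraction:.0%} of total)")
```

With these invented numbers the cluster comes out roughly three-quarters idle, which is squarely in the 80-90%-idle territory the talk says is common when teams first look.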
So, we're excited about the OpenCost project; already a number of integrations from users we've talked to just in the past week are in the works, so we expect more and more really cool open source solutions built on top of this data. And secondly, having the ability to think about costs by any dimension is ultimately what enables teams to actually do something with this data. As engineers, we tend to be pretty skeptical when you just show us, say, a single cost for a microservice or a single cost for a namespace. Our first question is often why, and our second and third questions are also often why. Having the ability to drill down, to say, okay, show me exactly where I'm requesting these resources, or show me exactly what usage is, and, if you can go all the way down to cAdvisor, show me the source of truth you're using, can be really powerful for getting teammates bought in. It also removes the ambiguity of, hold on, you say this costs $100, somebody else says it costs $150; it takes that off the table when you can look at it from different dimensions. And third, we talked about how in the VM-based, pre-container world there was this cloud tagging and mapping exercise, which was often, in my view, pretty cumbersome. Now this just fits into the developer workflow. However you're organizing applications, whether by namespace or something else, OpenCost will just pick that up by default. If you did want to use, say, labels or annotations or something within Kubernetes land, it's generally really easy to use a policy agent in some capacity to actually gate deployments, so that can be fixed or addressed really quickly.
And then we talked about how cost efficiency at the cluster level is typically the best place to start, whether you're thinking about this from a FinOps point of view or as an engineering optimization exercise. I mentioned how it is not uncommon for us to see teams running at 10 to 20% cost efficiency, or 80 to 90% idle. There are obviously a lot of different applications being run on Kubernetes, but I'm really surprised how commonly teams cluster around, say, 60 to 75% idle once they spend even just a little bit of time thinking about this problem. The net impact of that for most is a major, major cost reduction, typically with zero impact to performance, reliability, et cetera. So those are some of the key lessons we have learned. I wanted to open it up for any questions. I know we've got a couple of people in here that have worked on UIs and integrations on top of OpenCost, so welcome any hard questions. [Audience question, off-mic.] Great question. So OpenCost itself is the core allocation engine that Kubecost built. Again, it was contributed to the OpenCost project at the time the community built the OpenCost spec. If you look at anything from the community version up to the enterprise version of Kubecost, all of the pricing data and all of the allocation is based on OpenCost. Now that OpenCost has been accepted to the CNCF, we're going through the process of moving it to a totally neutral home. If you look at it today, the OpenCost project still lives in the Kubecost GitHub organization; it's going to be moving to the new OpenCost GitHub organization, maybe as soon as this week, if not in the next couple of weeks. If you look at opencost.io, there are certain places where it still points back to Kubecost docs; we're going to have a totally independent OpenCost.
So we're going through that process now of totally separating it. Kubecost will, as far as I can see, always build on top of, and truly import, the OpenCost library as its core. And we're really excited to see what other projects may think about doing something similar there. Does that answer your question? Cool, cool. [Audience question: for example, using the cost report from the cloud vendor, and a follow-up question: how do you reflect various pricing mechanisms such as reserved instances or spot instances?] Cool, great questions. So, twofold. By default, when you deploy OpenCost, it understands the environment you're running in. I heard CUR, so let's assume it's an AWS or EKS environment. By default, as it's spinning up, it goes and pulls public billing pricing from AWS. So if you're running an m5.xlarge in us-east, et cetera, it reflects public pricing. But then you have the ability to integrate it with your CUR, the Cost and Usage Report. There it would actually reflect, say, the cost of an enterprise discount program, a savings plan application, a reserved instance, et cetera. So again, by default it's public billing pricing, but if you are using spot or any of these other mechanisms, you have the ability to integrate it with your actual cloud account. All right, any other questions on OpenCost? Any takers? Okay, excellent. Then I think we can all go grab lunch, if that sounds good and you haven't had it already. Thank you all for joining today, and let me know if I can share more. Again, there's another cool talk or two later this week on actual applications of this data in different places, starting to think about the optimization part of the equation now that you have this real-time visibility. Cool, thank you all, I appreciate it.