Hello, everyone, and welcome to Open Observability Talks. I'm your host, Dotan Horovits. Here at Open Observability Talks, we talk about anything DevOps, observability, and open source. So may the open source be with you. I'd like to thank our sponsor, Logz.io, the Cloud Native Observability Platform. Logz.io takes best-of-breed open source projects, such as Prometheus, OpenSearch, and Jaeger, and offers them as a unified observability platform built for scale. For those joining the live stream on YouTube or Twitch, feel free to share questions and comments in the chat. It definitely makes things more interesting for us here on the fireside chat. And let's move on to today's episode. Last episode I discussed the challenges of monitoring Kubernetes operationally: things such as configuration complexity, high churn rate, et cetera. Today, I'd like to talk about the challenge of monitoring your Kubernetes spend. With the current financial climate, cost reduction is top of mind for everyone. IT is one of the biggest cost centers, and companies realize that they simply don't understand the cost of their Kubernetes workloads, or even have observability into basic units of cost. So we'll discuss this and the FinOps discipline for addressing it. And there's also a fascinating open source project, OpenCost, which aims to provide an open standard around that. For this topic, I invited Matt Ray, who is the senior community manager for the OpenCost project, a veteran of the open source and DevOps communities, and also a fellow podcaster. Let me invite Matt to the stream. Hey, Matt, good morning. Good morning. Thanks for having me. Glad to be here. And thanks for taking this live stream so early in your time zone. You're based in Australia, so it's like 6 AM now, right? Yeah, yeah. It's early, but, you know, Montreal never sleeps. Yeah, I'm based in Tel Aviv and working a lot with the US, especially the West Coast.
I know the challenges, but with Australia, it can be even more interesting. And as I mentioned, you're a fellow podcaster: you're a co-host of the Software Defined Talk podcast. Yes. A small anecdote, by the way: last week, I delivered talks in Belgium at FOSDEM and Config Management Camp, and there I ran into your co-host, Michael Coté. So I hope he's with us today on the live stream; I did invite him personally. Maybe before we go into today's episode, do you want to tell us a bit about the show? Sure, sure. Software Defined Talk is a podcast that I formed with two of my friends, Brandon Whichard and Michael Coté. Each of us has a different background in the enterprise software industry. Coté is currently in marketing, but he's got a background as an industry analyst. He worked for RedMonk and the 451 Group, and he also worked at Dell in mergers and acquisitions. So he's got an interesting industry background. Brandon has been a product manager for quite a while at different monitoring platforms. I think he worked at Boundary back in the day, and he's worked at identity-related startups. And we all worked together at BMC, one of the granddaddies of monitoring. My background is engineering and community and developer relations, down that path. So the three of us have different viewpoints on how the industry works. We've been podcasting for about seven or eight years now, and I think we hit episode 400 last week. So we're still going strong, and we all live on different sides of the planet, which is fun. Coté lives in Amsterdam, I live in Sydney, and Brandon's back in Austin, Texas. That's amazing. This show is in its third year, and looking at more veteran shows such as yours, there's definitely a lot to learn. So great to see that. And hopefully my followers that like podcasts will definitely find it interesting.
I highly recommend it. And let's talk about FinOps. That's a hot topic these days. As I mentioned in the opening, with the potential recession, many organizations are looking at what they're spending money on, and cloud and infrastructure cost is typically the second highest line item after salary cost, I think. So it's definitely top of mind for everyone. Before delving into the details, let's start with level-setting the basics. Can you help us figure out what FinOps is all about? Sure. So there's the FinOps Foundation, which is a foundation under the Linux Foundation. They're open source without being code-based. It's a group that came together to talk about the intersection between cloud, finance, and operations. Not everybody, but most people are starting to run a lot of operations in the cloud, and it's a different cost model. Instead of going and buying a bunch of servers and waiting for them to be racked and eventually deploying your stuff three months later, you buy on demand. From the finance side of the house, that's a really different model. Instead of buying a bunch of stuff and sitting it in your own data center, now you're renting by the hour, by the minute, by the second, and you have to bring in that intersection of understanding how your costs are run, how they escalate, how they are managed, and what development and operations need to do with your infrastructure. That's where FinOps lives. And it's a new practice. I mean, I was talking to my kids the other day, and my son was like, oh, is AI gonna put everybody out of business? And I was like, no, you keep moving forward; the jobs of the future you don't know about today. If you had told me when I was a kid that I would be working on an open source financial operations monitoring platform, I would just look at you cross-eyed.
So it's a new thing, but they put on their first conference in 2019, and they put out a framework, if you will, that explains how you think about these things, what you need to consider, what principles you need to adopt: things like bringing observability into your entire stack and bringing in all the stakeholders. It's not just engineers, it's not just finance; it's lots of different folks in between. And how you track these things, how the business makes decisions. Are you gonna spend more because you're selling more, or do you need to cut costs because there's a belt-tightening phase? What do you need to do? FinOps gives you a bunch of different maturity phases and principles around it. It's a great framework if you haven't heard about it. And if you haven't started and you're in the cloud, you should start. Yeah, definitely. And it's very easy to find at FinOps.org, with lots of useful resources online. It is under the Linux Foundation, but as you said, it's not about code so much as about principles and guidelines. And what I found, not specifically in the FinOps Foundation, but in general in applying FinOps: we, for example, at Logz.io where I work, have a designated FinOps team. And when I try to explain to newcomers or others what it is about, I always emphasize that it's maybe first and foremost about communication and about culture. That's the essence for me: breaking silos, obviously providing visibility and observability into the cost units, but also creating this ongoing conversation about cloud costs, looping these costs into business decisions, and getting people to talk together, bringing the business and finance side of the house to talk with engineering or with product. Under normal circumstances, these organizations don't communicate so smoothly together.
They also think differently: engineering with the agile, move-fast, fail-fast mindset, and finance with very long, formal processes. Both sides of the house need to adjust to make this happen. That's for me. And maybe one more point that I found useful is the six core principles, which I think are very good for those who are just starting out. Collaboration is obviously one of them, and ownership is another very important one: engineering can't just say, okay, we are about building the software, someone else will take care of the cost and the infrastructure elements and things like that. Now it's an integral part: accountability and ownership baked into the organization. And obviously the observability side of things that comes with it; you can't take ownership if you don't have very clear reporting and dashboarding and ways to see where things stand. So all of these principles are, I think, very useful, especially for those who are new to this. And even before getting into Kubernetes, for whoever uses SaaS and uses cloud, it's definitely highly relevant right now. Yeah, and they've got a great O'Reilly book called Cloud FinOps, and the second edition just came out a week or two ago. So if you haven't bought that book yet, definitely check it out. It's by a bunch of different authors, most of whom work for the FinOps Foundation at this point. I highly recommend it to anybody who's getting into the space, because you're gonna learn something new. And we mentioned several stakeholders involved; I mentioned business and finance and engineering and product. I'm curious, from your perspective, who are the core stakeholders that you see involved in these processes? Well, to clarify, my day job is I work for a Kubernetes cost management platform.
So we are on the more technical end of the spectrum generally. A lot of my coworkers have come from some of the larger, more established cloud financial platforms. So the space is not new. I mean, as soon as Amazon started letting people do S3 and EC2, bills started showing up and people started having conversations with finance. And so the stakeholders over the years have been finance, CTOs, CFOs, engineering; everyone's trying to work that out. Most of the people I talked to before I changed roles within Kubecost to the OpenCost side of the house, when I was working with our larger customers, customers spending a million dollars a month on AWS, that sort of stuff, were mostly folks from the engineering side. They had been told that they were spending too much and they needed to get a handle on what was going on, or they were a large enterprise where lots of different teams were consuming cloud resources and they needed to sort out chargeback or showback. For those of you who are unfamiliar with large enterprises, a lot of times you get this bill and somebody has to take responsibility internally, and you might have different budgets. The nice version of it is showback, where you show who's responsible: hey, team A is 60% of the usage and team B is 40%. Chargeback is when you actually have an established budget. It came from the days of enterprise software where you had all this compute internally and people had to share. And now that you have to share the bill from an external source, it's chargeback. But with Kubernetes, a lot of it is such a black box, even to engineering. Engineering generally doesn't pay close attention to their bills there.
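The showback split described here (team A at 60%, team B at 40%) is just proportional allocation of a shared bill. A minimal sketch in Python, with purely illustrative team names and numbers:

```python
def showback(bill_total, usage_by_team):
    """Split a shared cloud bill in proportion to each team's measured usage."""
    total_usage = sum(usage_by_team.values())
    return {team: bill_total * usage / total_usage
            for team, usage in usage_by_team.items()}

# Illustrative: a $100,000 bill split by relative usage (60% / 40%).
split = showback(100_000, {"team-a": 60, "team-b": 40})
print(split)  # {'team-a': 60000.0, 'team-b': 40000.0}
```

Chargeback uses the same arithmetic; the difference is that the result lands in each team's actual budget rather than just in a report.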
They're not thinking, oh, when I call this function, it's gonna cost you 30 cents more a day. Nobody thinks that way. And so initially, just like the FinOps model talks about crawl, walk, run, we like to just bring in some observability and get people looking at what they're doing. You don't have to have chargeback; it's showback, you're just showing people what they're using. In some cases, we even call it shameback. Because you're like, did you really need a 3XL instance to run NGINX? Probably not. But it is costing $1.30 an hour. So it's just getting people comfortable with the idea that everything you're doing costs money somewhere. And we like to bring that cost monitoring into the conversation. And beyond the tooling, I know that you're from the tooling side of the house, but looking at it from the end user perspective, or the driver, the agent within the organization trying to drive this awareness: how do you create awareness of cost and spend amongst engineers? Usually someone outside of engineering has noticed the bill, right? Somebody has said, hey, have you noticed that last month we spent $100,000 and this month we spent $150,000? Is this gonna continue every month? What does this growth look like? And so somebody who's responsible for that bill goes over to engineering management, usually. It's not like they call up the NGINX team, the DevOps team, and say, hey guys, can you fix this? They start at the management layer and say, can you justify this? They're not saying stop it, because clearly the business is there to deliver some purpose; IT is not just running servers because they like blinking lights.
They're there to deliver value. And so that conversation gets held: can we bring in just a little bit of visibility? Can we see what's going on? And maybe you could say, well, last month was Christmas, or Lunar New Year on our side of the planet, and there was a big rush of need for compute, the big shopping season, and next month's gonna be fine. And if it's not, we'll come back and revisit this. But it's usually coming from the business side; they've got concerns. And if you're in a small startup, everybody's on the same team at the beginning, right? You see that bill, you become aware of it. But really, as soon as you start to spend that money, somebody will probably wonder, are we spending too much? And what we wanna do is start that conversation of, look, here's how you're spending your money. And maybe it's just fine, but I can tell you, when I look at customers' bills, it's not fine. Usually there's a lot of waste, what we call idle, right? Because with the cloud, there are kind of two models of consumption. There's usage-based: if you're doing, well, S3 is not a great example, if you're doing Lambda, right, you pay by the usage. Every time you make a call, you pay some fraction of a cent. If you do a million calls in one day, it's this much, and if you do five calls the next day, it's less. And that's great. But the other, more common model is paying based on what you've allocated. I say, I need to run 15 EC2 instances, and they're going to run 24/7 for three weeks. Well, I don't get to say, they weren't really busy for part of that time, so I'm not gonna pay the full price. Amazon doesn't care. You have those machines allocated to you; you will pay the full price.
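The allocation-based model just described is where idle waste comes from: you pay for what you provision, not what you use. A small sketch of that arithmetic, with illustrative rates and utilization rather than real AWS prices:

```python
def idle_cost(hourly_rate, hours, utilization):
    """Cost of provisioned-but-unused capacity under allocation-based pricing.

    You pay hourly_rate * hours no matter what; the idle share is the
    fraction of that spend not matched by actual usage."""
    allocated = hourly_rate * hours
    return allocated * (1.0 - utilization)

# Illustrative: 15 instances at $0.15/hr, running 24/7 for three weeks, 5% busy.
hours = 24 * 7 * 3
total_idle = 15 * idle_cost(0.15, hours, 0.05)
print(f"${total_idle:,.2f} spent on idle capacity")
```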
And so if you're using them 100%, that's great for you. If you're using them 5%, that's great for Amazon, because they're gonna resell that capacity to someone else. But you're paying either way. And so what we would look for is that unused cost. We wanna optimize for usage. If your usage is spiky, well, sometimes you have to give yourself some headroom. If your usage is flat and you're paying a lot for hardly anything, well, you can optimize that. And that's just one form of optimization; there are lots. Yeah, obviously. And you mentioned Amazon; in general, the cloud providers do provide tools. First of all, you get the bill, and you have some breakdown in the bill, and they have their own cost tools, such as AWS Cost Explorer. So what can you do with these, and where do they fall short, in your perspective? Right, right. For most people, there are two Amazon bills. There's the one-page bill, which nobody really likes. I mean, if you show it to a CEO, they're like, what does a million dollars of EC2 mean? There's no breakdown. And then the other bill is called the Cost and Usage Report. And that is the very, very, very fine-grained, machine-readable report of everything that you do in AWS that costs money. Amazon drops it in an S3 bucket for you, and you can consume it with the tool of your choice. All of the different billing tools that read your bill are gonna look at this, and this is where it has your discounts. You might get savings plans or reserved instances; you might have discounted pricing that you've negotiated with Amazon. Well, I keep saying Amazon, but Azure, GCP, all the cloud providers do this. They have very, very fine, detailed, complicated bills that need to be processed. And those aren't even human-readable. So that's kind of the two ends of the spectrum for you as a consumer.
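Once a Cost and Usage Report (or its Azure/GCP equivalent) is in hand, billing tools roll its line items up into per-service views. A toy sketch of that roll-up; a real CUR has hundreds of columns and arrives as files in an S3 bucket, so this flat schema and these numbers are purely illustrative:

```python
from collections import defaultdict

# Illustrative line items; a real Cost and Usage Report is far more detailed.
line_items = [
    {"service": "AmazonEC2", "cost": 500_000.0},
    {"service": "AmazonEC2", "cost": 12_500.0},
    {"service": "AmazonS3",  "cost": 80_000.0},
    {"service": "AWSLambda", "cost": 7_500.0},
]

def cost_by_service(items):
    """Aggregate billing line items into a per-service total."""
    totals = defaultdict(float)
    for item in items:
        totals[item["service"]] += item["cost"]
    return dict(totals)

print(cost_by_service(line_items))
# {'AmazonEC2': 512500.0, 'AmazonS3': 80000.0, 'AWSLambda': 7500.0}
```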
And then Amazon provides tools like Cost Explorer. And it says, look, your bill was, we'll just say, a million dollars. As you drill into it, you can say, well, half of it was EC2. And then you drill into that, and you can continue to go down and see, per instance, how much you were spending, which days, what you were paying for; you've got some VPCs, some storage, all those things. And you can see all the details, but it's not machine-readable. It's not something that you can easily integrate on your side into your visualization tool of choice, whether it's Grafana or Crystal Reports or Excel; you're gonna wanna consume that and put it into your financial engine or your monitoring engine. Both of them are endpoints that people care about. And they don't know anything about Kubernetes, right? And you probably don't want Amazon saying, hey, business team A did this and business team B did that. You're like, stay out of our business. So they don't know anything about what's happening inside the nodes. When you get your EKS, your Amazon Kubernetes bill, it's EC2 with a management fee. You don't know anything more than that: which namespace was doing what, which deployments were costing you. That's completely opaque to you as an AWS customer, from your bill's perspective. And that's what we do. So before getting into the tooling, I just want to clarify: FinOps started around cloud costs in general. And you gave some good examples. Some people don't even understand what they have, like reserved capacity, how well they utilize it, or when they run on demand and how that performs, and the trade-offs, things such as that. Lots to do there as well. But since we started touching on Kubernetes, and this is the topic of today: how is Kubernetes spend different from the cloud spend we've discussed so far?
So, I mean, Kubernetes spend for most customers is just part of their bill. Most shops are not 100% Kubernetes. They're gonna have some traditional workloads that are running on cloud instances. You've got some Windows machines running in the cloud; that's generally not Kubernetes. You've got some S3, some databases, Beanstalk, whatever it might be. That's not the Kubernetes side of the house. And so we saw that most workloads were headed this direction, well, not most, but a significant portion of the market, and the tooling did not really serve them. The first edition of the Cloud FinOps book didn't really cover containers and Kubernetes at all; I mean, there's a brief chapter on it, but it's brief. So we saw this opportunity, and that's where Kubecost, and later OpenCost, the open source component, came from: we wanted to make sure that everyone could get visibility into what's happening there. So what's different about it is that it's just not a box that had been opened by most of the cloud tools at this point. Yeah, but I do think, at least when I try to analyze the way I did that in this organization and previous organizations that I worked in, there are some different characteristics that I do see: the difficulty to track the shared resources when you're looking at the cloud costs; allocating the spend, maybe you alluded to that before, allocating spend per customer, per team, per environment, things like that; or tracking the cost efficiency of your Kubernetes workload allocations over time, across different aggregations. Can you say what you've been seeing with your customers? Well, Kubernetes changes the game. For a lot of shops, moving to the cloud was lift and shift, right? They're like, hey, now we don't have to have a data center.
And they just move their workloads, fairly static, not very dynamic workloads, into the cloud; they clean things up, and some of them didn't get the savings they really expected in the cloud, because they didn't really change their operations; they're just now in somebody else's data center. But if you've made the transition to more of a cloud native model, which is essentially what Kubernetes is, where resources are allocated on demand, you've got some applications that are going to run for maybe a certain amount of time, or they're going to run wherever it's cheapest. They are potentially ephemeral; they don't have a lot of state, and they can be killed and rerun and moved around. When you start to get into that use case, billing becomes more complicated, but also you can save a lot more money. Kubernetes allows you to potentially condense the amount of compute you're using. So now instead of having one application per instance, you can say, well, I've got a cluster running there, let's deploy a hundred applications to it, and the cluster can be resized up or down, and those workloads will move to different compute nodes as necessary to run. And just like virtualization potentially saved a lot of money by reducing the bare metal count, condensing it onto more powerful servers that cost less because you had fewer of them, Kubernetes allows us that option too. And from the billing side of things, that can be a nightmare, because you've got your application today, it's running on node one, and Kubernetes decided that it wanted to move it over to node two, to node three, to node four. It killed some instances, redeployed it, you deployed a hot patch, you got put on node seven. That application is just running all over the cloud, all over what you're paying for. But when you get that bill, it just says 15 compute nodes, and you have no idea, well, what did team A do? What did team B do?
How much of those 15 machines was any particular team's? And so those changing characteristics make things a little more exciting, but potentially there are savings, because you can look at it and say, well, we're paying for 15 nodes, but looking at the compute, the idle across the cluster, we've never passed 20% usage. Maybe we don't need 15 nodes in our cluster; maybe we can get by with 10. And that's an opportunity for savings. Or we look at that and say, hey, our Kubernetes cluster has been up for three months, we've been running 15 nodes; we could reduce the size of it, or we could call up Amazon and negotiate reserved instances. We're gonna go ahead and pay for 10 nodes a month at a 50% discount for the next year. And you can get deals like that, because Amazon likes to know that you paid up front, and you like to save money. And your workloads might be really dynamic, but the base infrastructure they're running on is statically priced. And so that's comforting on the financial side of things. Things become more predictable, even if on top of it, on top of your Kubernetes, you're deploying 30 times a day; that's fine, as long as the EC2 nodes just stay there and get charged the same amount.
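The reserved-instance trade sketched here (commit to 10 nodes at a 50% discount, keep the rest on demand) is easy to sanity-check with arithmetic. All rates below are illustrative, not real AWS prices:

```python
def monthly_cost(nodes, reserved, rate, discount, hours=730):
    """Blended monthly cost: `reserved` nodes at a discounted rate, the
    remainder on demand. 730 is roughly the number of hours in a month."""
    discounted = min(nodes, reserved) * rate * (1.0 - discount)
    on_demand = max(0, nodes - reserved) * rate
    return (discounted + on_demand) * hours

# Illustrative: 15 nodes at $0.15/hr; reserve 10 of them at 50% off.
before = monthly_cost(15, 0, 0.15, 0.50)
after = monthly_cost(15, 10, 0.15, 0.50)
print(f"saves ${before - after:,.2f} per month")
```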
Yeah, that's from the infrastructure side. But still, when trying to create this accountability that we talked about before, and needing this attribution model per team or per customer or per environment, the fact that the deployments and namespaces and so on are not really isolated, and they actually share the underlying resources, you mentioned nodes, and we can also mention the persistent volumes, the load balancers and so on, then the ability to make this attribution, and then do the forecasting and the capacity planning, and also negotiate based on where you expect the business to grow in the different teams and different product lines and so on, becomes maybe a bit more challenging, let's put it this way. For sure, for sure. Especially in a bigger enterprise. And remember that some of this is justified from a business perspective: if there's a spike in incoming requests for Rihanna's Super Bowl halftime performance this week and you're in the media business, then it's expected; the spike is tightly related to what you deliver to your customers, the value, and hopefully you know how to monetize it, or it's tied to the top-line KPIs, so it's fine. It's different when the cost spikes because of a legitimate business need that arises, unlike cost spikes that are just bad utilization, or someone just left an EC2 test instance or a machine learning model running or something. We've even seen malicious instances, right? When you see a cost spike in a namespace and you're like, what's going on over there? We've definitely helped customers find unsecured instances, or applications that have been compromised and start mining Bitcoin, and you're like, why did this cost you $100 over the weekend when it was a dev instance that wasn't supposed to be doing anything? That happens too. It's weird to think of finance as intrusion detection, but there you go. Yeah, it's proven itself more than once.
That's definitely one of the common stories: people discover Bitcoin mining after deploying simple FinOps monitoring. So let's talk, we touched on all these vendors, and one of the challenges is also that they speak slightly different languages. It's not as though there's a way you can compare your AWS bill to Azure, to GCP, or to other bills, because there is no common way of communicating costs, which for me was the way to explain to others maybe the most basic value proposition of OpenCost. So let's move on to OpenCost. This is the new open source project, the new kid on the block in FinOps and in the cloud native space for sure, which, by the way, was recently accepted to the CNCF Sandbox. So a big congratulations to you and the team there. Can you tell us a bit about what OpenCost is about? Sure. So Kubecost started, I guess, about four years ago now, and the two founders had come from Google. They did some other work before starting Kubecost, but they were already well familiar with the open source space, and they started the Kubecost cost model as an open source project and then added additional value on top of it. The intention was, let's get this out there as the standard for Kubernetes cost monitoring. They started it, and COVID happened and kind of disrupted a lot of plans. But in June of last year, 2022, we announced that the CNCF had elevated the Kubecost cost model to a sandbox project, and they renamed it OpenCost. One of the things you're not allowed to do in CNCF land is have the company with the same name as the project; you've got to give up your trademarks and stuff, which is good. And so in addition to the code base, they've been working with other vendors, other individuals, and other end users on writing a specification for what it means to monitor Kubernetes for cost: how you identify different types of usage, whether it's idle or allocated.
And so there's both a specification, the OpenCost specification v1, which covers allocation monitoring, and an implementation. That's the first pass of OpenCost: what it does is it goes and looks at the cloud API. It says, AWS, you have four EC2 instances running; how much do they cost per hour? And the API says, list price is this. It takes that and then compares it to your Kubernetes usage. It says, well, we've got five namespaces and this many instances; these are our workloads or pods or containers, and it lets you slice and dice that. So you break down those EC2 instances by all of those Kubernetes primitives, and that's essentially what you get with OpenCost today. There's a UI to let you explore this. The data is stored in Prometheus, so if you have a Prometheus-compatible tool, you could put it in a different backend if you want. The folks over at Grafana are storing it in Mimir. People use Thanos, Cortex, VictoriaMetrics, you name it; somebody's put it in a different Prometheus-compatible backend. OpenCost is a CNCF project, so it's Apache-licensed. The goal for us with OpenCost is just to make it the ubiquitous default monitoring stack for cost. As soon as you spin up a Kubernetes cluster in any public cloud, you just throw OpenCost on it to keep an eye on it, and then put it on the dashboard of your choice. And so that's what we're doing with OpenCost. And today, essentially, it supports both on-prem environments and also cloud-managed Kubernetes. Can you mention what's currently supported? Yeah, yeah. Because this was the engine of Kubecost, it already came working out of the box with AWS, GCP, and Azure. There is support for on-prem pricing, so you can provide a default pricing through a ConfigMap. You can say, hey, in my data center, we charge a dollar per hour per core, and we charge $2 per hour for RAM. Some nice, simple pricing.
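Flat pricing like that turns a pod's resource requests directly into a cost. A sketch of the arithmetic: the $1/core-hour and $2/hour RAM rates echo the example above (treated here as per-GB, which is an assumption), and in a real deployment OpenCost reads such custom pricing from a ConfigMap rather than from code:

```python
def pod_hourly_cost(cpu_cores, ram_gb, cpu_rate=1.0, ram_gb_rate=2.0):
    """Hourly cost of a pod from its resource requests under flat on-prem
    pricing: cpu_rate in $/core-hour, ram_gb_rate in $/GB-hour (illustrative)."""
    return cpu_cores * cpu_rate + ram_gb * ram_gb_rate

# A pod requesting half a core and 1 GB of RAM:
print(pod_hourly_cost(0.5, 1.0))  # 2.5 ($/hour at these illustrative rates)
```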
And that's what I did on my home instances. I don't have fractional cents; I just wanna see nice round numbers for my home usage. But it lets you set that pricing. There's support for more fine-grained pricing; you can charge for GPUs, because you may be consuming GPUs for AI or whatever. So that's in there too. That's what it shipped with back in June, and then the open source community has started adding other platforms. So we've got some patches in for Scaleway, a European provider; we've got some for Aliyun, Alibaba's public cloud; that's in there. There've been conversations with some other providers. We've got a document for how to get started, and there's a very friendly Slack channel over in the CNCF Slack, so come ask questions; I'm happy to help you add your public cloud. And it's not that hard, because we're not digesting the bill. I mentioned the Cost and Usage Report as this multi-gigabyte file. Every vendor does it differently, and those files don't come out in real time. That's one of the weird things about cloud billing: you have your on-demand cost. You look at it when you kick off your EC2 instances and it says 15 cents an hour, and you're like, okay. And then maybe 48 hours later, Amazon says, well, you did have some discount savings, you had a couple of credits, you had reserved instances; it was actually only seven cents an hour. And that might change how you look at things. For OpenCost, that is a lot of complexity. Going and doing that reconciliation, parsing that bill on demand, finding the data in there: OpenCost doesn't do that yet. And I can tell you from the Kubecost engineering side of the house, that's a lot of work. From the OpenCost side of things, most people don't seem to miss that. We're really just looking at, how much does this generally cost me?
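One way to ask "how much does this generally cost me" of a running OpenCost install is through the Prometheus metrics it exports. The sketch below only builds a query URL for the standard Prometheus HTTP API; `node_total_hourly_cost` is one of the metrics the OpenCost exporter publishes, but check your deployment's /metrics endpoint for the full set, and the host name here is a placeholder:

```python
from urllib.parse import urlencode

def prometheus_query_url(base_url, promql):
    """Build a request URL for the standard Prometheus /api/v1/query endpoint."""
    return f"{base_url}/api/v1/query?{urlencode({'query': promql})}"

# Cluster-wide hourly cost: sum the per-node cost metric OpenCost exports.
url = prometheus_query_url("http://prometheus.example:9090",
                           "sum(node_total_hourly_cost)")
print(url)
# Fetch it with any HTTP client, e.g.:
#   import json, urllib.request
#   result = json.load(urllib.request.urlopen(url))["data"]["result"]
```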
And you know, the actual numbers are less important than the direction. I want to make sure that our listeners understand. So essentially it's not the actual bill that factors in all the credits and the discounting; it's a relatively static mapping, the on-demand price, whether from the managed cloud provider or one you map yourself for your own on-prem environment. Yeah, yeah. And then the math on top of that, right? Right, right. It does not parse the final bill; it's going with the list price. And you know, for people like me, I pay list price. I just run some instances on my own and I don't have an Amazon salesperson. So a lot of small and medium businesses are not in any sort of negotiation. But even in places like Google, with their sustained use discounts, that would show up later. That shows up days or even weeks after you've spent the money, so they're kind of retrofitting your bill afterwards. OpenCost doesn't do that. It's a large engineering effort, and at this point we're still looking at other cost sources. So right now OpenCost provides what's allocated for your Kubernetes: the instances, the storage, and networking. We're giving you that cost based off of your deployed Kubernetes cluster. Right now we're actually headed in the direction of out-of-cluster costs. So you've got a remote database, a database as a service that you're integrating with. Kubernetes doesn't actually know anything about that. But if you want to incorporate that bill, if you want to see that with OpenCost, that's what we're headed towards: bringing in external asset costs.
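The out-of-cluster direction Matt describes amounts to a join: take the in-cluster allocation and add external asset costs keyed by some shared label, such as team or namespace. This is a hypothetical sketch of that idea with made-up data shapes, not the actual OpenCost design.

```python
# Sketch of merging out-of-cluster costs (a managed database, object
# storage, a monitoring vendor) into the in-cluster Kubernetes allocation,
# joined on a shared label such as team. Entirely illustrative.

def merge_costs(in_cluster, external):
    """Both args map a label (e.g. team name) to a cost; returns totals."""
    combined = dict(in_cluster)
    for label, cost in external.items():
        combined[label] = combined.get(label, 0.0) + cost
    return combined

totals = merge_costs(
    {"team-a": 120.0, "team-b": 80.0},                 # Kubernetes allocation
    {"team-a": 45.0, "team-b": 10.0, "team-c": 5.0},   # RDS, S3, monitoring...
)
# team-a's true footprint is its cluster share plus its external assets
```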
So object storage, you know, S3, RDS, monitoring; you want to see how much your data costs you, how much Logz.io costs you, and bring it back to your workloads. Well, that's where we're headed, which is actually different from what KubeCost does. So we're kind of diverging, because open source is usually about people scratching their own itches, and the itch that most of our users have is: I have other costs that I want to bring in. People are definitely concerned about their final bills, and, you know, KubeCost is a business. KubeCost is free; it's just not open source. Yeah, so KubeCost is free for single clusters, and it gives you 14 days of storage. So you can deploy all the KubeCost you want, and it's free until you want to do things like federated views and more storage and so on. But maybe someday they will open source their bill-processing engine; that is an enterprise beast of its own. On a different podcast, I heard an interview with the product manager for Amazon's billing engine, and he said he's pretty confident that the Amazon billing engine is the largest non-government software project in the world. I mean, think about that. Think about how much compute there is, how big AWS is. Everyone is running their workloads on AWS, and they're generating hundreds of thousands, if not millions, of metrics per second, per customer. And they've got millions of customers, and all that data has to be stored, processed, and sent to billing in a timely manner. So a substantial piece of AWS's infrastructure is running their own internal billing engine for everybody who's on there. And so, you know, bills are very, very, very complicated.
Yeah, I can tell you, my company is on the larger end of medium-sized, let's say, and from what I see, for small enterprises all this packaging and all this crediting math can definitely move the needle and entirely change the bottom line. And you have your dedicated person, whether you're an ISV or a medium-sized company or an enterprise. So I'm just wondering: you're saying that the pain there is less, at least from the community, and the demand comes more on the platform side of things, like platform pieces that they incorporate for data or others. Are there any other items on the roadmap for the OpenCost project? I mean, definitely, one of our goals with OpenCost is to move out of the CNCF sandbox. It's been open source for just about six months now, and we're trying to build up our community: get external contributors, more people active besides KubeCost employees. Last week, we announced that Grafana Labs is now a contributor. They are deployed on thousands of clusters; I don't know if that was their exact number, but I've heard thousands of Kubernetes clusters thrown around, some of them very dynamic. And so they just deploy OpenCost under everything, just as a monitoring agent to get visibility into all the costs. And they're pumping it into Mimir, which is their storage backend. And their use case: they want everything on dashboards, of course. So they've shown up and started contributing, making OpenCost more efficient. And part of the roadmap is what sort of stuff is appealing to those sorts of shops. They've brought some things. We've had folks from other cloud vendors show up; they want better support for their clouds, and that's always on the roadmap. OpenCost is slowly diverging from KubeCost.
You know, they're different use cases. And so part of the roadmap will be things like taking these open source contributions and getting our own release cadence. And as we have more external contributors and more documentation, we can move up the CNCF ladder, you know, graduate out of being a sandbox project to an incubating project. And part of that is just having more external folks involved. It's still early days, but we're having a good, healthy share of contributions from outside. OpenTelemetry is something that has popped up; there are some open issues and feature requests for that, and people would like to see OpenCost implement it. And that's the sort of thing that OpenCost can do faster than KubeCost, because we're a much smaller, more efficient project. We just need more community folks to get involved. That's amazing. And I'm glad to hear that you have more people outside of KubeCost. So it's KubeCost people, but now also Grafana Labs people; are there any other major entities involved in the project? Yeah, yeah. So when we launched OpenCost, I mentioned it wasn't just KubeCost open-sourcing something. We had the specification, and we had folks from Adobe, Armory, AWS, D2iQ, Google, New Relic, SUSE, Pixie, and Red Hat. Those folks were all involved with the specification. And so OpenCost is both a specification and a project, and some of those shops will be taking the specification and releasing their own implementations. And part of that is that eventually we'll need to form an acceptance criteria framework that looks at your implementation and ensures that you are implementing the API. So all of that scaffolding needs to happen.
Some of it's in place, some of it's in progress. And today, as we're working on the external allocation costs, different contributors are getting involved, which is great to see. And you mentioned that it's a young project, but still, beyond KubeCost it is implemented by others as well, right? Yeah, yeah. There are folks showing up, lots of questions in the Slack channel. Grafana is one of the largest public deployments; it just fit their model very well. But I'm hoping to get more names to publish. We've got a couple of partners who have added OpenCost APIs to their products; I think Vantage is one of them. I'm just drawing a blank right now, but look to see more OpenCost-compatible API endpoints out there, especially in a lot of these Kubernetes platforms where they say, hey, we provide a dashboard to give you everything. Well, they'll put the OpenCost APIs on there so you can pull your financial data out of them. Whether it's actually provided by OpenCost the project, who knows. But that's the great thing about being a specification: you're driving a standard. Exactly. And I think you mentioned OpenTelemetry; it's a great role model to follow. And if you can also converge and do something between the projects, it's definitely going to be a force multiplier. Before we wrap up the main part of the show, can you share, beyond what you briefly mentioned before, how people can join the community conversation, learn more, and get involved? Right. So the primary one: the CNCF runs a Slack, so if you're a member of the CNCF Slack, join us over on the OpenCost channel. OpenCost.io is the website.
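For the OpenCost-compatible API endpoints Matt mentions, here is a minimal sketch of how a client might build an allocation query. The port, endpoint path, and parameter names are assumptions based on OpenCost's documentation at the time; check opencost.io for the current API. The sketch only constructs the URL rather than calling a live service.

```python
# Sketch of building a query against an OpenCost-style allocation API.
# The default base URL, the /allocation/compute path, and the window and
# aggregate parameters are assumptions; verify against the OpenCost docs.
from urllib.parse import urlencode

def allocation_url(base="http://localhost:9003", window="1d", aggregate="namespace"):
    """Build an allocation query URL for the given time window and grouping."""
    query = urlencode({"window": window, "aggregate": aggregate})
    return f"{base}/allocation/compute?{query}"

url = allocation_url()
# The resulting URL can then be fetched with urllib.request.urlopen(url),
# returning JSON cost data grouped by namespace.
```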
github.com/opencost is where the project, the website, and the Helm chart currently live. We have a calendar, and every two weeks, every fortnight, we have a working group where we gather with an agenda: what we're working on, what we need help with, what people would like to see. Some people show up and just want to talk about their issue, and some people say, hey, let's start this external asset working group. So we're going to have a group kicking off a new specification and an example project; those are about to kick off. So if you'd like to see external asset costs get added to OpenCost, join up. I think we're going to implement S3 as the example, but once we're done, we'll have a specification, documentation, and a working example, so you can add whatever it might be that you'd like to track in your cost monitoring and then tie it back to your Kubernetes usage. That's what we're doing over in OpenCost. Amazing. And just a note for the listeners: even if you're not an official member of the CNCF, you don't need to pay anything; the CNCF Slack is open for everyone. You can just open your user there, and once there, you have all the channels in the world for all the projects, one of which is OpenCost, and you'll be there in the conversation. Don't think that you need to be some sort of formal member; it's open for everyone. And you should also join the FinOps Foundation. So that's finops.org; they have their own Slack. There's not an OpenCost channel, but there are a lot of channels for different clouds and different tool sets. There are working groups for things like open billing, and there's a Kubernetes and containers working group that obviously we're active in.
And so OpenCost is the only project that is both FinOps Certified and a CNCF project; we are the intersection of those two worlds. So definitely join either or both Slacks, and I'll see you there. Yeah, sibling organizations under the Linux Foundation. Yes. Great, that was fascinating. And with that, I'd like to wrap up this part and, with the few minutes that we have left, cover some interesting bits and some breaking news. Very happy for you to stick around with me for these parts; you probably have some interesting insights, especially with your perspective and familiarity. The first one I wanted to share is actually something that I have been working on recently. I call it the metrics essentials trilogy: three articles that are meant to cover a lot of the common topics that I keep encountering with users, community members, customers, and others. I called them the phantom metrics, the expensive metrics, and the unreadable metrics. Phantom metrics is about why your monitoring dashboard may be lying to you a bit: the basics of how it works, what to expect of it, where the monitoring will not show you exactly what happens in real time, and how to take it with the relevant perspective. The expensive metrics is somewhat tied to what we talked about today: why your monitoring data and bill may get out of hand and the costs associated with it, like the cardinality problem and others. And the last one is a guide to effective dashboard design for DevOps-type monitoring. So you're more than welcome to check it out; it's on Medium. Some of that, by the way, ties back to topics that we've covered on this show, like the episode with Ben Sigelman about the cost of monitoring. So if you follow this show, it will resonate, but I think it's a good summary; do share some feedback. Glad to make it a starting point for a broader discussion.
The next one is a CNCF blog post about what's new in the Prometheus ecosystem: things such as the agent mode, native histograms, newly added service discovery mechanisms, the PromLens project being contributed to Prometheus, and much more. It offers a good rundown, and I highly recommend you check it out. I also highly recommend the episode we had here on the show recently with Julien Pivotto, which goes much more in depth, but this post on the CNCF blog is definitely worth checking out. Another thing we mentioned here before is OpenTelemetry. I saw on the CNCF blog a very interesting post about migrating from OpenTracing to OpenTelemetry. For those who are not familiar, OpenTracing is a deprecated standard, the API specification that existed before and was then merged into OpenTelemetry together with OpenCensus and some other pieces. So many older implementations that used to run on OpenTracing now need to migrate to OpenTelemetry, and this was a very good walkthrough of how to do that in a pragmatic way. Matt, have you had a chance to do a migration like that, or to play around with it? Not yet, but I already had that tab open, so I'm definitely going to read that article. Yeah, it's a good one. There used to be a shim that made it very easy, but it also shielded you from the real migration work that needs to be done. So I definitely advise not taking the shim path; I think the shim is already sunsetted and not supported anymore. It's important to understand that OpenTelemetry diverged and expanded far beyond OpenTracing. I mean, the OpenTelemetry project and API specifications are definitely worth a deep migration path rather than the shallow one, and the post gives good coverage of that. And there are also some CNCF project updates, in addition to the good news around the OpenCost project.
We had some good news at the end of the year about some projects; I think the most prominent ones were Argo and Flux, which have graduated from CNCF incubation, so they're now in the graduated state. And some others too. Next month on the episode here I will have Chris Aniszczyk, or CRA as most know him, the CTO of the CNCF; he'll be here with me on the show, and we'll definitely be talking about the changes, the project landscape, and some of his predictions. So do join us on next month's episode; I promise it'll be interesting. He's an interesting guy. And Matt, anything else that you found interesting this week or in the past few months? Well, I just wanted to point out that OpenCost is going to be at the Southern California Linux Expo next month. So if you are in the Southern California area, definitely show up at the conference; it's North America's largest open source community conference. SCaLE 20X, right? Yes, SCaLE 20X. I'll be giving a talk, and OpenCost will be sharing a booth with KubeCost, so hopefully I'll have some stickers there if you show up. And of course, I look forward to seeing everyone at KubeCon EU if you can make that. Yeah, I'm definitely going to be there, so I look forward to finally meeting you in person. Yeah, it's a long haul from Sydney, but I'll be there. So do check it out. And by the way, there are lots of co-located events on the first day. It's been reshuffled; the CNCF did a bit of reorganization there because it got a bit out of hand with all the colos, so now it's consolidated. For example, Prometheus Day and OTel Day and others are now consolidated into one Observability Day. Nothing to do with Open Observability Talks here at the show, though it definitely touches upon the same topics. It makes it easier: just one day with all the colos together. So I highly recommend also checking out the relevant colos, and I look forward to seeing you there, Matt, and all the others.
That's great. So thank you very much, Matt. How can people reach out to you after the show? Yeah. So if you're not in Slack, if you want to get a hold of me over email, I'm Matt Ray at kubecost.com. I'm on LinkedIn and Mastodon; I kind of stopped using Twitter, but I'm Matt Ray in those places, and Matt Ray on GitHub, so I'm usually pretty easy to find. I look forward to catching up with folks. Amazing. So thank you very much, Matt, for joining me at this early hour, Australia time. It was a fascinating talk. And thank you, of course, to all the listeners who joined us today on this episode. All the episodes, as always, are made available on all your favorite podcast apps and on YouTube, so do check them out. And if you are listening to this episode on demand, then do know that we stream the episodes live on Twitch and YouTube. Just find all the details on openobservability.io, or follow us on Twitter at openobserv for updates on the next live streams and to share your comments, suggestions, news bits, or anything else. And if you have something specific that you want to talk about on the show, and you think that you're a subject matter expert on these relevant topics, do feel free to submit a topic proposal on openobservability.io. I'm Dotan Horvits. Thank you very much for listening, and see you on next month's episode.