Hello, my name is Matt Ray, I'm the Community Manager for OpenCost, and I'll be your host for today's session, a deep dive on Kubernetes cost monitoring with OpenCost. We're going to be digging into the details of how OpenCost calculates the cost of workloads running on your Kubernetes clusters.

Let's start with a quick introduction to the OpenCost project, if you're not already familiar. OpenCost is the open source CNCF project for monitoring infrastructure and container costs in real time. Initially it was focused strictly on Kubernetes costs, but we've recently expanded our functionality into general cloud costs as well. We won't be covering that in this talk, but check out the website for more details on cloud costs.

OpenCost is both a project and a specification for modeling current and historical Kubernetes spend and resource allocation. Let's take a look at the OpenCost specification. This is the document that was written as a vendor-neutral specification for measuring and allocating infrastructure and container costs within Kubernetes environments. It was written by a variety of experts across many companies, including Kubecost, AWS, Red Hat, Adobe, SUSE, and many others. The point of the specification is to identify how we split costs across Kubernetes workloads, and there are a lot of factors to consider: management fees, expenses from nodes, persistent volumes, attached disks, load balancers, and network ingress and egress.

Cloud costs are converted to allocated costs: they've been requested from the Kubernetes cluster, so you're paying for them. From the cloud provider billing API, we get the numbers on the left. These are the raw metrics you're paying for. OpenCost allows you to view these costs by the workload aggregations on the right. You can see CPU usage by labels, GPU by deployments, cost by namespaces, however it is you want to query your costs. All the Kubernetes primitives for aggregating containers are there and exposed in the API for querying.
When a Kubernetes node is deployed on your cloud, it now has a total cost. Total cluster costs are everything associated with your Kubernetes deployment. In OpenCost, this is your Kubernetes cluster: the compute nodes it runs on, the storage it consumes, and the overhead of cluster management. Any unallocated costs on the Kubernetes cluster are cluster idle costs. You're paying for them, but not directly using them to run your applications.

Workloads are the actual applications and containers allocated and running on your Kubernetes nodes: the containers, pods, deployments, persistent volumes, whatever is running in your Kubernetes cluster. Workload costs may be resource usage costs or resource allocation costs. Resource allocation costs are generally things that have an hourly rate, and resource usage costs are billed by the amount consumed, like storage or bandwidth. Either way, workloads are directly managed by the scheduler. We want to measure these at the lowest level possible so we can track this data along any dimension. The asset costs, by contrast, are the individual nodes and overhead that make up your Kubernetes cluster. Resource allocation and usage costs generally cover the node's CPU, RAM, operating system, and potentially GPUs. These are calculated by querying the Kubernetes API, cAdvisor, kube-state-metrics, and billing data. This is a simplification; each of these breaks down much further in the OpenCost specification.

So let's calculate our workload costs by answering the following questions. How long did the container run? How much CPU was allocated? How much RAM was allocated? On which node did the container run? What is the price of CPU on that node? What is the price of RAM on that node?

A few quick notes on my setup. I'm running OpenCost 1.107.1 on a two-node EKS cluster on AWS with t3a.medium instances, and I'm using the default Prometheus installation.
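Those questions boil down to one formula: runtime multiplied by the sum of CPU allocation times CPU price and RAM allocation times RAM price. Here's a minimal Python sketch of that formula; every number in the example is a placeholder, not a figure from my cluster:

```python
def workload_cost(hours, avg_cpu_cores, cpu_price_per_core_hour,
                  avg_ram_gb, ram_price_per_gb_hour):
    """Cost = runtime * (avg CPU allocation * CPU price + avg RAM allocation * RAM price)."""
    cpu_cost = hours * avg_cpu_cores * cpu_price_per_core_hour
    ram_cost = hours * avg_ram_gb * ram_price_per_gb_hour
    return cpu_cost + ram_cost

# Made-up example: 1.55 hours, 1 core at $0.02/core-hour, 0.5 GB at $0.01/GB-hour.
print(workload_cost(1.55, 1.0, 0.02, 0.5, 0.01))
```

The rest of the walkthrough is just filling in each of those five inputs from Prometheus.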
Before the presentation, I created the stress namespace and ran the stress-ng application in it. After that ran for a while, I deleted the application. I forwarded ports for Prometheus, the OpenCost UI, and the OpenCost API. And I have a Google Sheet that you can use if you want to run the calculation yourself.

This is the manifest for the stress test we're running. It's a stress-ng image, configured to give us a good sample of data over time. Note that we've provided requests but not limits. Limits would make our math a lot less interesting, as we'll see.

First we want to answer the question: how long did the container run? Let's hop over to Prometheus and run some queries. The container metric kube_pod_container_status_running comes from kube-state-metrics. First, let's get the time range for our container. The pod has already terminated, so we'll look for its start and end within a two-hour window. We'll reuse this end time with all our queries for consistency. We're not worried about the value on the y-axis; it's always one. We start collecting values when the pod starts and stop when it ends, sampling every minute. Our first sample is at 18:03 and our last is at 19:36, so one hour and 33 minutes, or 93 minutes. We'll enter it into our spreadsheet as 93 minutes divided by 60, giving us the number of hours the container ran.

Next we want to find how much CPU was allocated. container_cpu_allocation is a metric provided by OpenCost. We'll filter on the stress namespace and the stress pod to narrow it down. We'll grab the evaluation time again and set the range to two hours. Looking at this output, we see the values range from our request floor of 50 millicores to nearly two full cores. This behavior exemplifies the definition of resource allocation, which is the max of request and usage. We requested 50 millicores, but the actual usage is much higher.
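The duration arithmetic above can be sketched in a few lines. The timestamps are the 18:03 and 19:36 sample times from the demo (the date itself is arbitrary), and the PromQL selector in the comment is an assumed shape for the kube_pod_container_status_running query:

```python
from datetime import datetime

# Assumed PromQL, filtered to our workload:
#   kube_pod_container_status_running{namespace="stress", pod=~"stress.*"}
# First and last sample timestamps observed in Prometheus:
start = datetime.fromisoformat("2023-01-01T18:03:00")
end = datetime.fromisoformat("2023-01-01T19:36:00")

minutes = (end - start).total_seconds() / 60
hours = minutes / 60
print(minutes, hours)  # 93 minutes -> 1.55 hours
```

That 1.55 hours is the first value that goes into the spreadsheet.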
Kubernetes has allocated the request size for the container, but we did not provide a limit equal to our request, meaning that the cost of this container varies over time depending on the amount of CPU usage. To calculate CPU usage for a period of time, we're going to take the average of the CPU allocation. We'll plug the equation into Prometheus for our time period and take the avg_over_time for our two-hour window. That is the average number of CPUs allocated to this container for this period of time, and it gives us a number we can plug into our spreadsheet. Now we have how much CPU was allocated.

Next, let's get the amount of memory allocated to our container. This query is similar to our CPU query: we're running the container_memory_allocation_bytes query for our container. We provided a request size of 50 megabytes, and our stress application is going over that floor. Our container's RAM usage varies over time as well. We'll divide the result by 1024 to get kilobytes, by 1024 again to get megabytes, and by 1024 again to get gigabytes, because gigabytes are the unit used in our cost calculations. Just like we did with the CPU, we'll take the avg_over_time. That provides another number we can plug into our spreadsheet. We now have our average CPU and average RAM usage for the lifespan of the container.

On which node did the container run is actually quite straightforward: the output from the CPU and RAM queries already contained the node the container was running on, so we don't need to make any new queries. Using the node ID from our previous output, we'll plug that into the node_cpu_hourly_cost query to find the price of CPU on that node. The node_cpu_hourly_cost query gives us a nice flat result in the graph, but we'll take the average for consistency. It doesn't really change over time, but this will protect us if there were any pricing fluctuations from our provider.
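The RAM averaging and unit conversion can be sketched like this. The sample values are invented for illustration, standing in for the per-minute series that avg_over_time(container_memory_allocation_bytes{...}[2h]) averages for us on the Prometheus side:

```python
# Hypothetical per-minute samples of container_memory_allocation_bytes (bytes).
samples_bytes = [52_428_800, 78_643_200, 104_857_600]

# What avg_over_time does: the mean of the samples in the window.
avg_bytes = sum(samples_bytes) / len(samples_bytes)

# Divide by 1024 three times: bytes -> kilobytes -> megabytes -> gigabytes.
avg_gb = avg_bytes / 1024 / 1024 / 1024
print(avg_gb)
```

The same averaging applies to container_cpu_allocation, minus the unit conversion, since that metric is already in cores.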
Let's put the node name into our spreadsheet, along with the CPU price for the node. The only remaining question is: what was the price of RAM on the node? We can paste the node from the previous queries into the node_ram_hourly_cost query. Again, we'll take the average for consistency, and that gives us a nice flat result in the graph as well. We plug this number into the spreadsheet, and that's the last one.

Once we have the numbers in our spreadsheet, we have a price for our container running over time. We have what the CPU cost and the RAM cost should be, and those add up to our total cost. It's only five cents, but you're probably running a lot more than just one container.

If we go to the OpenCost UI, we can use a date range to narrow down our window to see our running container. First, set the breakdown to container. The date range of today is probably fine, but the URL for the OpenCost UI takes the same parameters as the API, so we can get finer grained on what we're seeing. We can change the window to five hours to capture what we saw in Prometheus. There's our five cents.

If we query the API, we can see the raw numbers used by the UI and compare the output. Notice we have our minutes of runtime, which reported 92 minutes instead of the 93 we saw. This can be a function of the window of the query missing either the first or last minute of data. The CPU cores figure has the same average we recorded, and there's the CPU cost with more significant digits than the UI displayed, and a RAM bytes usage average as well as the RAM cost. Let's record these numbers into our spreadsheet for comparison. Only 92 minutes. We'll clear out the previous values, paste in from the API output, and divide the RAM figure by 1024 three times. With smaller numbers, it's hard to see the significant digits; with larger numbers, it's going to be much closer. If we're comparing results and wondering about slight differences, it may be a function of the resolution versus the data available.
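A sketch of the API request the UI is making. This assumes a default port-forward of the OpenCost API to localhost:9003 and the allocation endpoint with window, aggregate, and resolution parameters; adjust the host, port, and parameter values for your own setup:

```python
from urllib.parse import urlencode

# Assumed default port-forward target for the OpenCost API.
base = "http://localhost:9003/allocation"

# Match what we did in the UI: a five-hour window, broken down by container,
# at the default one-minute resolution.
params = {"window": "5h", "aggregate": "container", "resolution": "1m"}
url = f"{base}?{urlencode(params)}"
print(url)
```

Fetching that URL (with curl or any HTTP client) returns the raw allocation numbers, including the minutes, CPU core average, RAM byte average, and the CPU and RAM costs we compared against the spreadsheet.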
There may be slight differences from the window we're capturing, and this could be a function of the resolution of our queries. The resolution defaults to 60 seconds, or one minute, which is the default in the API calls as well. We can set that step explicitly in our Prometheus queries to ensure that we get each data point. The numbers will get closer as we don't accidentally miss a result on an overlapping boundary. Higher-resolution queries are much slower, so we tend to keep the resolution coarser, especially when we're looking at daily values. But we are pretty close in our results, and this makes sense, because the cost of our workloads is always going to be a function of the time window we're looking at, since workloads and nodes come and go, and of the resolution of the queries we make.

So to sum up, we answered these questions with queries from Prometheus. We saw how OpenCost calculates the duration of the container and the CPU and RAM average usages. If we'd had GPUs, we could have calculated them similarly. The price of the CPU and RAM on the node is derived from the billing API, and we multiply our average usage against the price by the time. These calculations are relatively straightforward. Hopefully this expands your understanding of OpenCost internals.

If you're not already involved with OpenCost, there are a lot of resources available to you. Check out opencost.io to get started and join the OpenCost community. We're always eager to see new folks. We're mostly on Slack and GitHub, with the OpenCost community meetings on the calendar every other Thursday at 1pm Pacific. I hope you found this informative and useful to your understanding of OpenCost calculations. Thanks for taking the time to watch today. See you in the OpenCost community.