Hello, everyone. Thank you for joining. I'm Isar Gilboa, CTO and one of the co-founders at Finout. We are a company that specializes in cloud cost observability, and we offer a wide variety of tools like virtual tagging, cost per unit, and cost per tenant. What I would like to do today is share with you the journey of how we attributed our cloud cost per tenant, and give you some takeaways on how to do that when you want to tackle that same exact problem in your environment. So let's begin with a question: why would we even want to do that? What value is there in attributing our cost to our different customers and tenants? The reason is that we want to measure what matters, and I'll explain what I mean by that. Take, for example, a company that multiplied its cloud costs by two from one year to the next. Is that a good thing? A bad thing? The answer is: it depends. As you can probably imagine, if this company sold ten times more of its product and has ten times more customers, then it's probably a good thing. And if there was no change at all in its business health, then it's probably a bad thing. So here we take cost and attribute it to something that genuinely reflects the health of our business, and a deterioration in that number probably means a deterioration in the business's health. It's also a great tool for R&D departments and finance departments to communicate better, because the entire conversation becomes more actionable, both in terms of what finance can do with the information and what finance can give R&D as a lead on where to start checking for issues. The answers finance gets back from R&D can also be better ones, because finance would know how to ask the right questions. For example: why did this specific customer's margin deteriorate?
And then R&D could go and check and say, oh, it's because they're running much more expensive queries, or something like that. And as I will show you today, we're talking about a generic solution, not something specific to tenants: something you could apply when you want to attribute your cost per project, per team, or to any logical business unit you want to attribute cost to. Any questions so far? Great. So what are we going to get at the end of this wonderful journey? This is what we chose to display in our system: a high-level view of each of our accounts and their costs, together with a column for margin, so we can see how much profit (or, in some cases, loss) we are making on each of our customers, and drill down into each customer to see their consumption of our different subsystems, as we will describe later on. But it's important to emphasize that you can take this data and plug it into any dashboard or BI system you're used to working with. So let's begin. Let's first talk about the challenges in building this entire story, let's call it, of allocating cost to business units. It doesn't have to be complicated, but it does comprise three main aspects. The first is the cost data, which we need to put all in one place. The second is the tenant data: for instance, what is the entire list of our customers, and how much does each of them use of each of our subproducts? And third, we are going to make heavy use of Kubernetes metrics. We'll tie all of these together and explain how the fact that we are heavily dependent on Kubernetes in our solution helps us and what we can do with it. And I'll show that once we have all three of these data points, we can put them together into one meaningful story. Any questions before we begin talking about these three challenges? Great. So first, let's talk about getting cost data.
So I assume most of you, except maybe those who can't due to regulations or restrictions, are running on the cloud. And usually your cloud provider gives you a detailed list of your cloud costs. For Amazon, it's called the Cost and Usage Report. For GCP, it's the billing dataset in your BigQuery. And it's usually a lot of data; for larger companies, it can be hundreds of gigabytes per day, and that's only for one cloud. Once you go multi-cloud, this problem gets more and more difficult to handle, and the data becomes much more vast. It also has a different structure, right? Because Amazon uses a different structure for its data than GCP or Azure. And let's also throw into that mix the software services we may use, like Snowflake or Datadog, which, as you can see in the example, are not negligible; they're something we'd usually want to include when allocating our production costs to customers. So this entire challenge is about taking all of the different sets of costs that we want to allocate to logical units and putting them all in one place, in one structure that we can query, so that we can ask simple questions like: what are all of the costs that we want to attribute to customers? I can share how we decided to handle this in our environment: we used Apache Spark, with its wide set of connectors to many tools, and we perform ETL operations to bring the entire dataset into a single structure. Any questions about that before I move on? Yeah. The CEO mainly, my partner. Yeah, we are a company of less than 40 employees at the moment. What do you mean by that? You mean who we at Finout sell to? It's a wide variety of decision makers, usually a VP of R&D or a CTO a lot of the time, or someone who sits in the gap between the tech side and the decision-making side.
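To make the unification step concrete, here is a minimal sketch of normalizing AWS and GCP billing rows into one common schema. It's written in plain Python for clarity; in a real setup this logic would run inside a Spark ETL job as described. The column names follow the AWS CUR and GCP BigQuery billing-export conventions, but the exact row shapes here are simplified assumptions, not Finout's actual pipeline.

```python
def normalize_aws_cur(row: dict) -> dict:
    """Map one AWS Cost and Usage Report line item to a common schema."""
    return {
        "date": row["lineItem/UsageStartDate"][:10],
        "service": row["product/ProductName"],
        "cost": float(row["lineItem/UnblendedCost"]),
        # CUR prefixes user tags with "resourceTags/user:".
        "tags": {
            k.removeprefix("resourceTags/user:"): v
            for k, v in row.items()
            if k.startswith("resourceTags/user:")
        },
        "source": "aws",
    }

def normalize_gcp_billing(row: dict) -> dict:
    """Map one GCP billing-export row (BigQuery) to the same schema."""
    return {
        "date": row["usage_start_time"][:10],
        "service": row["service"]["description"],
        "cost": float(row["cost"]),
        # GCP stores labels as a repeated key/value record.
        "tags": {label["key"]: label["value"] for label in row.get("labels", [])},
        "source": "gcp",
    }
```

Once every provider's rows pass through a normalizer like this, "what are all the costs we want to attribute to customers" becomes a single query over one table.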
If you want to ask any more questions about Finout, I'd love to answer them after the talk. Okay. So, any more questions about the costs? Great. Let's talk a bit about tenant data, and the easiest way I've found to get the point across efficiently is with an example. Let's take Netflix as an example of using tenant data. Imagine the entire Netflix system was running on one single server. How would we even start thinking about attributing its cost to the different customers that a huge company like Netflix has? Of course, there are different customers, and some cost more than others. For example, I'm a great customer for Netflix, because I pay the monthly bill and consume quite little. So let's agree that a customer who watches twice as many screens as me probably costs them double. That's one of the base assumptions we are going to work with, and we'll use it to allocate the costs of multi-tenant systems that we cannot, technologically speaking, break down by tags or any other means. In real life, there are more examples that we encounter. For cybersecurity companies, it's the number of scans they perform on their customers' accounts; they can attribute the entire cost of their multi-tenant system based on the number of scans performed. For log-indexing companies, the number of gigabytes ingested is the preferred method of attributing costs in their system. And usually this data source contains a lot more interesting data about the customers, for example ARR. That enables us to talk in margins: if we ingest not only the consumption but also the other interesting metadata into this database we're building, it will prove very helpful in the final result. And where can you usually find this sort of data?
Companies usually store this in their BI systems, where they keep their entire list of customers, all the metadata on them, the ARR, and the per-month usage of the system. It's often streamed into data warehouses; we've seen a lot of users choose Snowflake to store this information. Salesforce is another example of where this information may be found. And that's about it with regard to taking a set of multi-tenant costs that cannot be split and using metrics to split them. Any questions about that before I move on to Kubernetes? Great. So let's talk a bit about Kubernetes. The reason I talked about metrics first is that what we are going to apply in Kubernetes is very similar to the metrics method I just showed you. If we think about all of the Kubernetes metrics that most standard monitoring solutions collect on our systems, we end up with a lot of metrics that represent usage per pod: CPU consumption, memory consumption. What we decided to do is use these metrics as a building block to split apart the underlying EC2 nodes that these pods were running on. That enables us to use the pod, with all of its metadata, as the most granular unit that we can attribute costs to. And that was extremely useful for us, because we are a company with a lot of data pipelines, as I will show in later slides. This was a crucial feature for us, because after applying such a solution we were immediately able to ask questions like: what was the cost of the data pipelines for this customer on this day? And that handled about half of the complexity of our system quite easily. Any questions about that before I move on?
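The node-splitting idea above can be sketched in a few lines: take one node's daily cost and divide it among the pods that ran on it, in proportion to a per-pod usage weight derived from metrics (for instance CPU-hours, or a CPU/memory blend pulled from Prometheus). The weighting scheme here is an illustrative assumption, not Finout's exact formula.

```python
def split_node_cost(node_cost: float, pod_usage: dict[str, float]) -> dict[str, float]:
    """Split one node's daily cost across its pods, proportionally to
    a usage weight per pod (e.g. CPU-hours from container metrics)."""
    total = sum(pod_usage.values())
    if total == 0:
        # No recorded usage: fall back to an even split.
        return {pod: node_cost / len(pod_usage) for pod in pod_usage}
    return {pod: node_cost * usage / total for pod, usage in pod_usage.items()}
```

For a $10 node that ran a pod with weight 3.0 and a pod with weight 1.0, `split_node_cost(10.0, {"etl-acme": 3.0, "etl-globex": 1.0})` yields 7.5 and 2.5 respectively. The pod names are hypothetical.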
So essentially what I've shown you here is the three challenges and how we can tackle each of them individually: for costs, what we can do to gather them all in one place; for metrics, where we can find them and how to use them to allocate multi-tenant costs; and for Kubernetes, how we can use these metrics, apply the same metric-based approach we saw earlier, and get pod-level data. So let's take an example system and bring this whole thing together. This is a simplified version of our system. It ingests a lot of data on the left side, and it has data pipelines, a data warehouse, and the application. As you can see, we run everything on AWS, and everything that runs any workload runs in Kubernetes. Even Spark is in Kubernetes; the application is in Kubernetes; even our data warehouse is in Kubernetes. So we figured we were going to have to apply a Kubernetes solution in order to allocate our costs. The first thing to do when you want to allocate your system costs to logical units or customers is to identify which subsystems contribute to the total cost of customers. We're going to start with that, and for simplicity we'll describe two subsystems here. One is the ETL side: the batch side, the nightly workflows that run pods per customer every day. The second is the data warehouse and the Finout application side. And we are going to use the Kubernetes solution that I showed you earlier and apply it to that.
From the cloud provider, we can get all of the EC2 nodes that were serving our system each day, running the daily workflows for these customers. After applying the Kubernetes metrics and storing the totals in the same unified location I described earlier, we essentially end up with a place where I can get all of my customers' costs for a complete subsystem, just because it runs in Kubernetes and I have a customer pod label on it. It's that simple, once you break apart your cloud bill using a Kubernetes solution. So that side has been taken care of, and now we need to move to the other part of the system. Any questions so far? Yeah. Essentially, yes. Once you apply a generic Kubernetes solution that breaks each of your EC2 nodes down to the cost of each pod, you can ask questions like: what does this pod label cost me every day? Yeah. Sometimes it's more natural and sometimes it isn't. In this subsystem it was, because we actually run a dedicated pod for each specific customer every day. Yeah. We also use the metrics approach to take apart other multi-tenant systems, or API calls, exactly. Yes, that's right. So we take apart the EC2 nodes and store the billing records per pod in our centralized, let's call it, warehouse. When we query our centralized data warehouse, we ask questions like: how much does this pod label cost me? It's no longer considered a node; it loses the meaning of being a node and is stored as the different pods that were running on it. That was something we developed ourselves using the metrics from Prometheus, and you can also use metrics from Datadog. But there are products that do that for you. Yes, exactly. There is a set of products, Kubecost is one of them, Finout is one of them, that connect to your cloud bill and your metric systems, break apart each of the nodes, and let you ask questions like: how much does each pod label cost me?
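Once the bill is stored per pod, the "how much does this pod label cost me?" question is just a group-by over the label. A minimal sketch, with hypothetical pod names and labels:

```python
from collections import defaultdict

def cost_per_label(pod_costs: dict[str, float],
                   pod_labels: dict[str, dict[str, str]],
                   key: str = "customer") -> dict[str, float]:
    """Roll per-pod costs up to a pod label (e.g. a 'customer' label).
    Pods missing the label fall into an 'unallocated' bucket."""
    totals: dict[str, float] = defaultdict(float)
    for pod, cost in pod_costs.items():
        totals[pod_labels.get(pod, {}).get(key, "unallocated")] += cost
    return dict(totals)
```

In a real warehouse this is a SQL `GROUP BY` over the pod-level billing records rather than an in-memory loop; the shape of the question is the same.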
Yes. That's exactly what I'm going to talk about right now, and that's exactly the case in the area on the right here; that's why the Kubernetes solution doesn't solve it yet. Great. So I'll move on to solving that. In a case where even your finest-granularity unit is multi-tenant, what can you do? What you can do is look at it from a bird's-eye view, an even higher level: stop trying to allocate the finest granularity per customer, and instead ask yourself, okay, what is this subsystem? What's its purpose? What represents consumption per customer? If you remember the Netflix example from earlier: what makes one customer more expensive than another? That's where we turn to our more generic metric solution and think about what makes sense for us to attribute costs by. We chose query durations. Another good example we could have chosen is the amount of data scanned. If one customer is scanning 100 gigabytes per day and another is scanning 200 gigabytes per day, I mean in queries, not in the data pipelines, that would make the 200-gigabyte customer twice as expensive as the 100-gigabyte customer. And that's the method we apply for the right-hand side of the architecture. No, it's not... So that's how we... Sorry about the different design of this slide compared to the previous ones, but that solves the entire architecture I showed you earlier. And of course, as I mentioned, we chose to present it all in dashboards that look like this, but you don't have to; it's just raw data that you can use anywhere in your system. You can plug it into your Looker or other BI systems, use it however you like, get alerts based on it, plan budgets based on it: the sky is the limit. So let's sum it up. What we can do once we have our Kubernetes utilization... See the timer moving here? Am I good on time?
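The metric-based allocation just described, splitting a multi-tenant subsystem's cost by each customer's share of total query duration, can be sketched like this. The function and log shape are assumptions for illustration; the same shape works for gigabytes scanned or any other proxy metric.

```python
from collections import defaultdict

def allocate_by_query_duration(subsystem_cost: float,
                               query_log: list[tuple[str, float]]) -> dict[str, float]:
    """Allocate a subsystem's cost across customers, proportionally
    to each customer's total query duration.

    query_log holds (customer, duration_seconds) pairs, one per query.
    """
    per_customer: dict[str, float] = defaultdict(float)
    for customer, duration in query_log:
        per_customer[customer] += duration
    total = sum(per_customer.values())
    return {c: subsystem_cost * d / total for c, d in per_customer.items()}
```

So if the warehouse cost $300 for the day and one customer ran 100 seconds of queries while another ran 200, they are allocated $100 and $200 respectively, which matches the two-to-one intuition from the data-scanned example.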
Okay, great. Another thing that's important to mention is that once we apply a Kubernetes solution, we get immediate visibility into how much underutilization we have in our system. And then we are faced with two options. Either we can optimize it, ask ourselves why there is underutilization, why we have such high idle values, and act upon it; or we can also allocate and attribute the idle part to the customers, because we'll always have a certain amount of idle. It's part of our production; there's no way around it. And we want a smart way of attributing those costs to our different customers as well. So that's that. To conclude everything, I wanted you to leave with a few takeaways on how to analyze a system when we want to attribute its cost to customers. First, we want to identify our subsystems. Then, if a subsystem is running on Kubernetes, we'd probably want to apply some form of solution that can break those costs down to the Kubernetes level, so we have access to the actual pod or namespace cost. For the multi-tenant parts, we ask ourselves whether there are metrics we can split these costs by. And essentially, that gives me everything I need to take any cost in the system and attribute it per customer. So that's that. That's it. Any questions? Yes. That's a good question. Usually data transfer occurs due to a certain operation that happened in the system. If this operation is easily attributed to a customer operation, great; otherwise we need to ask ourselves what this operation is. Does it happen on demand from the customer? Does it happen in relation to how much data each customer is processing? You see where I'm going with this: try to find the reason the operation occurred, and whether there is any proxy metric we could allocate this cost by. Any other questions? Thank you very much, everyone.