Wow, hi. Welcome to our talk, "Why Kubernetes Can't Get Around FinOps". I hadn't expected so many in-person faces. Welcome, and good afternoon. My name is Manuela. I work for Liquid Reply as a consultant, and I'm the number-crunching person doing monitoring, KPIs, and so on. This lovely person next to me is my colleague Vanessa. She's also a consultant at Liquid Reply and our FinOps evangelist. What we actually love about doing FinOps every day is that we get to build and orchestrate high-performing FinOps teams and get DevOps people pretty excited about cost management. But before we dive deeper into the topic, we brought a small riddle for you. This is a cloud bill for a cluster; it's from AWS. Oh, sorry, I'm not supposed to walk that way. Okay, I would kindly like to ask you to spot the least efficient application or workload on this cloud bill. Can you give me a show of hands: who is able to do that? I can't see much, but I see a single hand back there.
Let's have a talk later, because basically you've figured out the first reason why Kubernetes can't get around FinOps: this cloud bill is what your business gets at the end of the month, and it is no longer sufficient when it comes to allocating costs to your workloads and projects. In fact, the latest CNCF survey shows that around 70 percent of companies either only estimate their Kubernetes costs or don't monitor them at all. Basically, that's the reason why we're here today. So what we want to do is give you a short glimpse of why you really can't get around FinOps, show a bit, based on our experience, how to gain cost transparency and control, and what everyone in this room can do in their daily practice to support FinOps. Some of you might remember those ancient pre-cloud days when you had to go to your project manager, or whatever manager, to ask for resources and get your server approved, the one you really needed to get your project up and running. In those ancient times, the managers were actually the people producing, managing, and approving the costs. But as we all know, that changed with the cloud. Now you have us engineers producing the costs. So you have a lot more people than before producing costs, just at the push of a button. But at the same time the old, traditional processes still apply. You have the same people as before, managers, finance, whatever their roles are called, and they still have to somehow manage the budget, and they still want to approve the costs and the budget to be able to govern and manage them. It would be easy to just update these processes based on the variable cloud cost model, but to be honest, the business and finance people don't really understand how the cloud works most of the time. And now Kubernetes comes in, and that's a whole new abstraction layer on top of the cloud, and that creates that
huge knowledge gap between the technical and the non-technical people, because how should they ever comprehend how Kubernetes and Kubernetes costs work when they can't even comprehend the variable cost model of the cloud? That's exactly what FinOps addresses, because at the end of the day we need the buy-in of those non-technical people. They have to plan the budget somehow, and we engineers don't want to do that. At least I don't. So, FinOps. You might remember the DevOps movement, or rather pre-DevOps: there were developers and operations people, and they didn't talk to each other, and it was a huge mess. Every team was annoyed by the other team, and it wasn't fun. Then DevOps came and brought these two teams together. Suddenly you had one team, and it worked pretty well. Today you have a similar situation: you have DevOps teams, and you have the finance and business people, and now you have to bring them together again to form a FinOps team. FinOps does that. At the end of the day, once all of these people begin to talk to each other again, it enables them to make the spending decisions the company needs and that you as an engineer need in order to finally be able to work on some cool features. And at the end of the day it increases the business value of the cloud and of Kubernetes for the whole company. But to start with that, you as an engineer really need to get into the heads of the finance and business people. That's where you need to know, or at least be able to ask, the questions they ask. They want to know: What are the top spending drivers? Which project has the highest expenses? How efficient or inefficient is a cluster or an environment? Or, in general: who uses what, when, and what's happening?
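As an illustration of those questions, here is a minimal Python sketch. It assumes you already have cost records that carry a project, an environment, a cost figure, and requested versus used CPU hours. The record layout, the field names, and all numbers are hypothetical and not taken from any real billing export; the only point is that "top spending drivers" and "efficiency" become simple aggregations once costs can be attributed.

```python
from collections import defaultdict

# Hypothetical cost records; field names and values are illustrative only.
RECORDS = [
    {"project": "shop", "environment": "prod", "cost": 1200.0,
     "cpu_requested": 400.0, "cpu_used": 300.0},
    {"project": "shop", "environment": "dev", "cost": 300.0,
     "cpu_requested": 100.0, "cpu_used": 20.0},
    {"project": "billing", "environment": "prod", "cost": 800.0,
     "cpu_requested": 200.0, "cpu_used": 180.0},
]


def cost_by(records, key):
    """Sum cost per value of `key`, highest spender first."""
    totals = defaultdict(float)
    for record in records:
        totals[record[key]] += record["cost"]
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)


def efficiency_by(records, key):
    """Used CPU divided by requested CPU per value of `key` (1.0 = no slack)."""
    requested = defaultdict(float)
    used = defaultdict(float)
    for record in records:
        requested[record[key]] += record["cpu_requested"]
        used[record[key]] += record["cpu_used"]
    return {k: used[k] / requested[k] for k in requested if requested[k] > 0}


if __name__ == "__main__":
    print("Top spending drivers:", cost_by(RECORDS, "project"))
    print("Efficiency per environment:", efficiency_by(RECORDS, "environment"))
```

In this made-up data the dev environment uses only a fifth of what it requests, which is the kind of answer a plain cloud bill cannot give.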
And that's the whole topic of transparency. Once you're able to answer those questions, you can go on to that purple question and ask: where can we start optimizing? Once you have the answer to that purple question, you can finally free up money, and that money could be well spent on other innovative features, or maybe on a new colleague, so that you finally don't have an 80-hour work week or something. Or you can finally buy some beer for your whole team. Well, basically it is all about how to gain transparency when it comes to Kubernetes and cloud cost management, to then have control over these costs. There are many different ways to get there, but the main strategies to gain cost transparency are monitoring and labeling, and, on the other side, to gain back control, right-sizing and waste management. Let's dive a little deeper into that. First of all, before anyone says anything about monitoring: this is not solely about performance. I'm pretty sure every one of you has at least one, or several, tools to monitor your clusters, your nodes, your workloads, whatsoever. The catch is that those focus solely on performance. When it comes to FinOps, all the monitoring you do has to be able to link to cost metrics. First of all, this is a different data source. Usually it's your cloud provider's bill, the agreed discounts, the reservations you made. It's about scenario building: how much would something cost if I ran it on demand? How much would it cost with, if we're talking about AWS, a Savings Plan? Your monitoring should be able to link those metrics, because then you can measure not only the efficiency but also tell how much it costs. This brings you one step closer to transparency. Now that we can see costs, we basically have a second problem. How many workloads are running?
Approximately, on your nodes and clusters? I bet a lot. And that's the issue. Remember the cloud bill from the beginning: I could totally say how much I spend on an S3 bucket based on that list, and I could totally tell you how much I have to pay for an EC2 instance. But that still doesn't give me the ability to answer the questions Vanessa just introduced. I can't tell how much a project spends. I can't tell how much an environment spends. I can't compare them to each other. So labeling is the key to this. The three things you can see on the screen right now are basically not new to you, but in the context of our daily work I want to highlight them. Why? The first thing, about the pod template, I had to learn the hard way: if you don't label at the right spots in your configuration template, you cannot monitor costs. I did it wrong the first time and labeled the deployment. Yes, that was a very stupid moment. We had to do it all over again, and we had written policies and everything, so it was chaos. So I want to stress that: don't repeat that mistake. The second thing is that when it comes to labeling, your key-value pairs have to target cost management. As for your keys, what you can see on the slide right now are the ones the FinOps community has come up with as the most common. Obviously you can individualize them for your organization, but to answer the questions we saw in the beginning, it's application, business unit, company, cost center, environment, and project that should be labeled when it comes to cost transparency. And one last important thing, about the value description: cost monitoring, and the cost reporting that comes out of it, is for non-technical people. So keep in mind when writing your values that they have to be understandable and comprehensible for non-technical people. Because even if you set up the perfect labeling, with policies and everything running, if no one besides you
understands it, then you have the same issue: you're the only one who knows what runs on your clusters. So now you've gained cost transparency, so basically I can tell you now what your environment is producing in costs. But this knowledge is nothing without knowing how to optimize. Optimizing, that's the purple question from before, and it starts with right-sizing. Right-sizing is all about setting the right amount of CPU or memory, the right amount of resources, for your cluster, your nodes, your workloads. And the essential part of right-sizing is that it starts with setting resource requests and limits. By default, Kubernetes doesn't set any resource limits on your pods, which means your pod can consume whatever amount of resources it wants. So it's our task to set resource requests and limits on our pods, and on all of them; it's not enough to put them on just one of the pods. Then we can automate the whole autoscaling magic. There are three autoscalers that I'm going to talk about, and the first one is for stateful workloads, when your workload just needs a bit more resources for a limited amount of time. As a lazy engineer you could just say, okay, I set those resource requests a bit higher than my pod actually needs. But again, that's just a waste of resources, so don't do that. On the other hand, you have those engineers who are the money-saving foxes and set just the amount of resources that a pod needs.
But then, in the worst case, you probably run into performance issues. The Vertical Pod Autoscaler addresses exactly that issue and automates it. The VPA monitors the actual usage of your pod and suggests new values for the resource requests if the pod needs them. There is even one configuration where it applies those suggested values directly to the pod, and the pod gets redeployed, and that happens automagically. That's for stateful workloads, but of course we also have stateless workloads, and that's what the Horizontal Pod Autoscaler is for. This one again monitors the actual usage of the pods and adds or removes pods based on the target value, the target CPU or memory, that you define in the autoscaler. That was the pod, the workload level. Then we have the other level, the infrastructure level, which means nodes. The Cluster Autoscaler does a similar thing to the Horizontal Pod Autoscaler, to be honest, because it just adds or removes nodes. But this one is not based on actual usage; it's based on the scheduling status of your pods. So if the Cluster Autoscaler sees that it's not possible to schedule a pod due to resource constraints on your nodes, it brings up a new node. The fourth point is eliminating waste. You might say, okay, autoscalers are doing exactly that, and yes, you're right, but there is more to it. You could, for example, add policies that shut down environments when you don't need them. For a dev environment or a test environment, those non-critical workloads, you could set a policy to shut them down over the weekend or during off-hours. Now some of you might say, okay, my dev environment, it's just $50, who cares? Take a look at the picture at the bottom of the slide. You see all those colorful bars, and every single one of them is one engineer saying: okay.
It's just $50. So at the end of the day, it's all about the sum of the costs that this produces. And actually that's a good reference back to the monitoring Manuela talked about earlier, because with cost monitoring you get back the overall view of things, and you're not blindsided by your own project. Yes, thank you. So, how to implement FinOps practice in your daily work. I want to refer to the last slide first. You saw those bars, so I think the first thing you should know about FinOps is that it's about getting out of the bubble and seeing the big picture. Because in fact, when you're working for an organization, for a company, you're not the only account. You're not the only developer. You're not the only project. And yes, if you have over-provisioned resources, or systems running on the weekends, your single account doesn't do any harm. But do the math: do this with 200 accounts, four weekends per month, 12 months a year. That is a lot of money you're wasting without any necessity. So I want to highlight this: a common understanding across roles.
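To make that arithmetic concrete, here is a minimal Python sketch. The talk gives 200 accounts, four weekends per month, and 12 months a year; treating the "just $50" as the cost of one account's idle environment per weekend is an assumption made here purely for illustration, since the talk does not say which period the $50 covers. The function name is hypothetical.

```python
def yearly_weekend_waste(accounts, cost_per_weekend,
                         weekends_per_month=4, months_per_year=12):
    """Yearly cost of environments left running over weekends.

    cost_per_weekend is an assumed figure per account, not a measured one.
    """
    return accounts * cost_per_weekend * weekends_per_month * months_per_year


if __name__ == "__main__":
    one_account = yearly_weekend_waste(accounts=1, cost_per_weekend=50)
    organization = yearly_weekend_waste(accounts=200, cost_per_weekend=50)
    print(f"One account:  ${one_account:,} per year")
    print(f"200 accounts: ${organization:,} per year")
```

Under that assumption a single account looks harmless at $2,400 a year, while the organization as a whole ends up at $480,000.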
That common understanding is the first thing about FinOps in general: to understand why it's important, and that it's not about just cutting costs across the board. When it comes to monitoring and labeling I want to highlight this one because, again, if you create labels in your bubble, and maybe across your team, that still doesn't ensure that business and finance can use these labels for reports and so on, and maybe another team is doing the exact same thing. So, like with everything you do on the technical side, you have to agree and collaborate on a standardized list, and then you have to make it part of your processes and documentation. The second thing, and this is a nice example, is naming conventions. I know some of you know this. When it comes to naming conventions, there are two important things to know. First, clusters: whatever monitoring tool you're using, since those tools are also for non-technical people, they use names. So whenever you use the same name for two different things, this can create confusion, as you can see in the example behind me. That happened when we were monitoring what we thought was one cluster, but by the end of the month, surprise, double the cost: it was actually two clusters. Again, this is not only about your team or your project. This is a standard across the organization and its projects. The second thing, and this is the other way around, is labels. I know, it's a very tiring topic, but it's very important to have a standardized spelling for how you do things. When I first started on my most recent client project, we had monitoring based on tags, and I had, I think, seven or eight different spellings for environment. Just for the key, not the value, just for the key. So here.
It's very important, again, that you manage to come up with a standard, agree on it, and put it in your documentation. The next thing, and this is again something that someone from finance and management can't do, is to ensure functional monitoring and labeling. What does that mean? It's a procedural change. Whenever you create something, you have to make sure that it is part of the monitoring. This is your job; no one can take it away from you. The same goes for the labels: whenever you do something new, you have to make sure that it is part of the monitoring and labeling. This is how you can ensure it from the technical side and help provide information. No one is expecting you to do the math or the reporting, but this is necessary so that someone else can. So, yesterday at our booth we had a visitor, and we were discussing autoscalers, and he was like, okay, that's kindergarten, autoscalers, everyone knows that. Then we asked, okay, how do you set your resource requests? Do you even do that? And he said, yeah, of course I do, and no autoscaling. And then we were like, okay, but what's the metric? How much CPU or memory do you set? And he said, okay, to be honest, I do it by gut feeling. Okay. Yeah, fine.
I mean, you're the engineer, maybe you have a good gut feeling. But then he was like, okay, usually I just add 10 percent to my gut feeling, just to be safe. Now imagine you add 10 percent to everything, to each and every one of your pods. Suddenly you have a huge overhead again, and that's just a waste of resources. So when setting resource requests initially, please do load tests or something similar, and over time you can of course improve those limits. Of course, you can always start with that plus 10 percent, but please improve it over time. Monitoring helps you do that; you can just iterate over the resource requests, and it will get better. Talking about autoscalers: they're a really cool thing, and you can even combine the Horizontal and the Vertical Pod Autoscaler. But the thing is, both of them can act on the same metrics; both of them could measure CPU or memory. That's the issue: when you configure both of them to monitor the same metric, you create a race condition, and that will not work. Third point, a short reminder: use the Vertical Pod Autoscaler for stateful workloads and the Horizontal Pod Autoscaler for stateless workloads. And last but not least, a short story from one of my projects. It was a project with amazing, genius engineers, as we all are, and they set the perfect configuration of autoscalers as well as the perfect configuration for weekend shutdown policies. Then the weekend came. The policy applied, the clusters went down, the autoscalers brought them back up, the integrated monitoring systems were blinking, and the operations team was awake. That happened a few times during the night, and it wasn't a very good night, at least for the operations team. So when using policies and autoscalers, make sure they work together and are well integrated with each other, and make sure that the surrounding systems don't blink and alert everyone involved. By the way, it was nasty. Let's
wrap it up. This was really just a very short glimpse into the world of FinOps, but the key message is that it's not about cutting costs. It's about enabling data-driven decisions, to then be able to save costs. Yes, that's a part of it. Monitoring, labeling, right-sizing, waste management: these are the key things to get there. Obviously there are many more things to do, but these are the key ones. And the best takeaway I would invite you all to take with you today is that the first step in daily practice is acknowledgement: basically knowing that this is a necessity with every abstraction level we create on the technical side, and then, in second place, making sure from the technical side that you help the people who need to understand it. Well, so this was it. We are very happy to be here today, and we invite you to come to our booth in Pavilion 2 at SU 32. But are there any questions yet? I think we have a microphone over there. Maybe I can... So, I think this lady was first. Thank you. There's a lot of correlation between cutting cloud costs, et cetera, and the environmental impact. Is there a link between FinOps and, I don't know what the next portmanteau is, GreenOps? Yes, definitely. That's actually one of my passion topics right now. Thanks for the question. Yes, there is a correlation. Of course, if you reduce costs, if you, for example, right-size your instances, then you save CO2. But it's not as if all of FinOps is GreenOps, because you have things like pricing discounts, for example, and you get them without right-sizing anything, so that doesn't have an impact on the CO2, as far as I know. But yes, there is a lot of correlation between the two. There is another microphone, right? Thank you so much. Are we working? Yeah, first off, thank you ever so much.
This has been really timely for the problems I'm going through at work at the moment, so I will definitely come over and talk to you at the booth. The thing I really wanted to ask about was the visualization aspect. I appreciate there are probably quite a few tools out there, but how do you get to a position where you can actually create reports that go out to your finance teams in a format that's actually consumable by them? And then a second question, which is around budgeting and operationalization. We work in an environment where our finance team is quite a long way from us, so how do we go about things like budgeting and financial preparation, given that we work on an annual cycle? Just those two questions. Thank you. Thank you very much for the question. I have so many answers to that, but I'll try to summarize them in two main things. The very first thing is that every monitoring tool you choose, whatever it is, has different functions depending on your environment. We are, for example, working with tools like Cloudability, Kubecost, and CloudHealth, you name them. We met a few here. The tricky part, and this is where it gets interesting, is to understand how your operation works. With the client I'm working with right now, we implemented the tool, but we also came up with certain individual thresholds to define: when do projects get alerted? When do they get recommendations? What's the amount of money at which it's worth actually going to people? Then we started creating reports, first manually, then we optimized them. Now they get frequent feedback, automated by the tool, and we have work groups that come together and work through those recommendations.
So this was a procedural change, change management that came with the tool. And the second thing: as I initially said, our main job is bringing exactly those departments together. In the projects we work in, we always have the same setup: finance is there, business is there, IT is somewhere, everywhere. The main thing is that you have to find allies. You have to find people who are willing to work on that, bring them to a table, and come up with a strategy. But we can talk about that in detail. Are there any more questions? Well, ah, there. Hi. Hello. Yeah. There was also someone there, and the microphone here is missing. Okay, it's here. I guess I'll go first. Hi, hello. Thank you for the presentation, very interesting, very good points. I was wondering: do you have any advice regarding right-sizing, maybe regarding load tests, if you found some particularly useful tools for doing that? Well, the first advice I can give you is: talk to your user teams. It sounds stupid, but if they know how a similar application is used live, they can give you numbers to actually model workloads, because they have the market experience, the production experience, and then you bring those numbers back into the teams. I would say we don't have a number, because it really differs from project to project and environment to environment. But I think the advice I could give you is to try to standardize even that. If you don't have any market response, try to think of scenarios that could happen.
I don't know, if you're talking about a basic example, an online shop: you have to be scalable, and you have to test what happens to your workload when you put a lot of traffic on it. And the second thing is that you should use the same test parameters across the team. I think that's the advice I could give you from scratch, without knowing anything about your project. Thank you. By the way, the lovely person running around with the microphone is one of our colleagues. Yeah, I was wondering: you were putting a lot of emphasis on the labels before, and you said that the labels shown there were found by you to be the best. But is this coming from some kind of standard from financial applications, or, oh, thank you, or is that defined freely per organization, or is there already some kind of industry standard for labeling things in Kubernetes? Thank you for this question. FinOps is based on community work, on the FinOps Foundation, which is part of the Linux Foundation. So everything we do, we do in a kind of collaboration and exchange, and what we presented to you are the results of a lot of people around the globe talking about this and figuring things out. There is no standard yet for certain things, because we're still exploring this fairly new topic, but these are the ones that overlap, whomever you talk to. So this is kind of the best practice we've figured out. Okay, well, I think there's one more. I'm sorry. Hi, I wanted to ask, mainly regarding labeling, whether you had any issues with partners, especially with fast-changing teams. Did you have any experience with fast-changing teams and labeling, where, I don't know, a team changes its name? How do people set their labels? Perhaps one uses an underscore, perhaps one uses a different convention. So how do you get around that issue?
Three things about that. The first thing is, as I said, that agreed standard. When we did this in my recent client project, we sat together and had a look at how the projects do it in general: do they use PascalCase, or whatever kind of label style? Based on that, we developed the labels, agreed on them, and sent them out to all the projects. We said, okay, listen, you have a say: you have, I don't know, two days or whatever to give feedback on whether you're okay with that or not. So this was kind of agreed, since we took everyone on board, and then we had this list documented. It's part of the deployment documentation in every single project, so it's an agreed standard. What we are working on right now is policies that enforce that with every resource deployed, so basically taking the same labels we defined manually and putting them into policies. But there are also tools that can add virtual tags; there are different ones. It always depends a bit on how big your environment is, how many projects you have, and how much money you want to spend. Any more questions? Thank you very much. Thanks for your time.
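The agreed label standard and the pod-template point from the talk can be sketched as a small check in Python. This is a minimal sketch under stated assumptions, not the policy tooling used in the project described: the six keys come from the talk, but their lowercase, hyphenated spelling is an assumption, and the function names `canonical_key` and `check_deployment` are hypothetical. The manifest is a plain dictionary shaped like a Kubernetes Deployment, where the labels that end up on pods live under `spec.template.metadata.labels`.

```python
import re

# Keys named in the talk; the hyphenated lowercase spelling is an assumption.
REQUIRED_KEYS = {"application", "business-unit", "company",
                 "cost-center", "environment", "project"}


def canonical_key(key):
    """Map spelling variants (CostCenter, cost_center, Cost Center) to one form."""
    key = re.sub(r"(?<=[a-z0-9])(?=[A-Z])", "-", key)  # split camel/Pascal case
    key = re.sub(r"[\s_]+", "-", key.strip())           # unify separators
    return key.lower()


def check_deployment(manifest):
    """Report required keys missing from the pod template and odd spellings."""
    pod_labels = (manifest.get("spec", {}).get("template", {})
                  .get("metadata", {}).get("labels") or {})
    present = {canonical_key(key) for key in pod_labels}
    return {
        "missing": sorted(REQUIRED_KEYS - present),
        "nonstandard": sorted(k for k in pod_labels if k != canonical_key(k)),
    }


if __name__ == "__main__":
    # Labels on the Deployment itself do not count; only the pod template does.
    deployment = {
        "kind": "Deployment",
        "metadata": {"labels": {"project": "shop", "environment": "dev"}},
        "spec": {"template": {"metadata": {
            "labels": {"Environment": "dev", "project": "shop"}}}},
    }
    print(check_deployment(deployment))
```

A check like this only flags deviations; mapping abbreviations such as a shortened key to the full name would need an explicit alias list agreed by the organization.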