Thank you so much. And thank you for the introduction. So we're going to go on a little bit of a journey. You know, your cloud will never run out of resources, right? Well, we'll learn a little bit more about that.

So, a little bit about who your guides are today. I'm Ravi Lachhman, one of two Ravis here today, and I'm Chief Evangelist at Harness; I run our developer advocacy and evangelism program at Harness. And Ravi here is the head of our cloud cost optimization group at Harness. Ravi, why don't you give a little bit of background about yourself? Maybe a quick second about yourself.

Absolutely. Thanks, Ravi. So hi, everyone, thanks very much for joining today. We have an exciting session ahead of us. My name is Ravi, also, and I'm the head of product for cloud cost optimization at Harness. Prior to this, I was CEO and co-founder of Lightwing, which was acquired by Harness recently. Lightwing does intelligent cloud automation to optimize public cloud spend. And prior to that, I ran a couple of tech ventures in e-commerce enablement, healthcare, and consulting. So cloud cost management is a problem and a space that's been very important to me, first as a consumer and now from the other side as well. Because if you're a large organization, doing this right means less cloud waste and more cash flow to invest in what matters. And if you're a smaller organization, it could basically be the difference between life and death for the company.

Yeah, thanks, Ravi. So let's talk about the journey we're going on today. The first thing we're going to ask is: is the cloud infinite? You know the lore of auto-scaling and getting capacity when you need it. Well, is it infinite? There's certainly a cost to it. Also, welcome to Kubernetes: all the rage these days. K8s is portable, it's ubiquitous, but is Kubernetes actually cheaper? Are you getting more density by using Kubernetes? Then, common cloud cost challenges: we'll walk through a few patterns of how these costs can rack up very quickly, and we'll give you some paradigms for starting to combat that. So we'll give you the cost challenges, and also some patterns for how you can start reducing them and getting a better grasp on them. And lastly, how to embrace FinOps. Paying homage to the Linux Foundation, there's actually a foundation called the FinOps Foundation, which is a sub-foundation (I'm trying not to use the word foundation too much) of the Linux Foundation. You can join it and learn more about how to combat and report on cloud costs, similar to the Agile movement.

So, the infinite cloud. Let's talk about why you'd even use a cloud, and whether it's really infinite. For some of us, if you're as old as I am, you remember racking and stacking. As a software engineer myself, I used to have little servers running under my desk, and then there were servers in another part of the office, and then there was a data center somewhere else. But one of the things that virtualization, and then cloud resources, ushered in is that there's no more rack and stack. Going from this person here, racking and stacking a blade, to using something like vSphere, vCenter, or vCAC (pick your VM platform of choice), you're able to actually build an internal cloud, right?
Like, hey, you know what, we have internal resources, we have a private cloud, and we're no longer subject to racking and stacking when we need resources. Now you might say, you know what, there's a VMware tax, but again, you're doing things via software. Now, in the world we live in today, the paradigm we live in today, there's no more VMware tax, right? If I need a new instance, going from vSphere or vCAC over on the right to the AWS EC2 console, you're able to spin up resources when you need them. You're not paying a license cost per se; just spin up a new Linux instance, CentOS or Ubuntu. You have the ability to pay only for the underlying hardware that you're using, or the underlying amount of time that you're using the hardware. And this is what Ravi and I will get into a little bit: there's some complexity in these billing dimensions, so there's no such thing as a free lunch, but this is the world in the public cloud that we live in today.

But why would you go about using the public cloud? There might be a whole discussion of operational expenditure, OPEX, versus capital expenditure, CAPEX, but really, one of the main reasons to use the public cloud is a concept I like to call time to value. If you look at just the pure hardware portion of it, those are low-margin services for the cloud providers; they don't make terribly much money on, let's say, an EC2 instance or a GCE instance. But take a look at your organization and think about something like this. Let's say you want all of this new tech stack on the left here: we want to use Kubernetes, we want to use a Cassandra database, we want to use some streaming such as Kafka, and we want some machine learning such as TensorFlow. If you had to bring these technologies into your organization, or you were charged with using them, and you had no experience with them, let's take a look at that journey really quickly. The first thing you might do is buy a funny-animal O'Reilly book, or a Packt or Apress book, to learn from other people who've done it. You might also go about getting certain certifications: you might become a CKA or a DataStax Certified Administrator, or say, you know what, I need to go hit up Stack Overflow to learn more about certain things. Basically, you're going through that journey to learn how to operationalize it. As a software engineer, hello world is easy; the hard part is what happens in a failure. And that's the learning journey you have to go through.

Now take a look back at the cloud providers: they offer you a quick time to value. Here I have an actual screenshot out of Amazon, of Amazon Elastic Kubernetes Service. If I needed a cluster (and I use EKS a lot), I can just go enter a few details and have a fully running EKS cluster. The same with SageMaker for machine learning, or Amazon-hosted Kafka, or Amazon-hosted Cassandra, or pick your public cloud provider of choice. They're giving you quick time to value by taking their operational expertise, bundling it, and selling it to you as a service. But that comes with a cost. And so eventually you're going to get a bill.
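To make "enter a few details" concrete, here's roughly what that boils down to: a minimal sketch, assuming you use eksctl, one common tool for standing up EKS. The names and sizes are illustrative, not taken from the bill we're about to look at.

```yaml
# cluster.yaml: a hypothetical, minimal EKS cluster definition for eksctl
apiVersion: eksctl.io/v1alpha5
kind: ClusterConfig
metadata:
  name: demo-cluster         # illustrative name
  region: us-east-1
nodeGroups:
  - name: workers
    instanceType: m5.xlarge  # every node here is an EC2 instance you pay for
    desiredCapacity: 2
```

One `eksctl create cluster -f cluster.yaml` later, the provider is running the control plane, the networking, and the nodes for you, and every one of those shows up as a line item.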
Actually, this is an actual screenshot of a bill I got from AWS a few months ago. And I was joking with Ravi as we were running through the presentation: I actually don't know what the NAT gateways were. I blatantly asked him, I don't know if I use one. I only use EKS, and maybe one EC2 instance. So I use one Kubernetes cluster, or several Kubernetes clusters, and maybe some other Linux machine types for some jump-box type of stuff. But why did my bill look like this? Well, there's a rationale behind that. Sometimes your bill is complex; there's no such thing as a free lunch, and I'm going to dig into that really quickly.

But before we dig into that, going back to my example: let's say that for this bill, I was only using Kubernetes. Which is true; for the most part, I was only using Kubernetes. But let's unpack some stuff. Please excuse my shorthand of K8s for Kubernetes. So, some of the benefits of Kubernetes, for folks in the audience who have dabbled with it or aren't sure what it is; just quickly, why folks are going towards Kubernetes. Your dev team and your operations team can speak the same declarative language, which is this YAML format you use when you build Kubernetes manifests. I'll role-play here: let's say Ravi was a software engineer and I'm an application infrastructure engineer. We can simply declare what needs to happen in a failure, or how we scale, the very non-functional requirements of the application. We're speaking the same language. It's portable: vanilla Kubernetes running in one cloud, or running in your data center, should run similarly in the public cloud. And if you don't like Kubernetes's opinion, it's pluggable; you can change the opinion on lots of things. You don't like how the ingress controller works? Get a new one. If you don't like a certain load-balancing scheme or a certain placement scheme, you can replace it. And vanilla Kubernetes itself is obviously one of the premier projects of the CNCF, a sub-foundation of the Linux Foundation, and it's free of license costs.

But there is a downside to Kubernetes, and this is a real number from about five weeks ago. On our platform, we run things in AWS and also in our second cloud provider, which is GCP, and we designed things for safety: if there needs to be a scaling event, it will occur. But somebody (wasn't me; Ravi, I know who it was, actually) ran a sample application, Hello World, and left some of the scaling rules, the auto-scaling stuff, unchecked, because they were doing a load test. And Hello World, about five weeks ago, cost us $15,000. That's the price of a Honda Civic. That's what it cost us. So it can be quite expensive, and this is Kubernetes, right? The auto-scaling kept kicking off over and over; nodes were getting added over and over again as density decreased and work couldn't get placed. It was quite expensive to run that. Someone had some explaining to do come Monday morning.
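The moral of that story, in config form: put a ceiling on your autoscaling before you load test. Here's a minimal sketch, assuming a Horizontal Pod Autoscaler is driving the scale-out; that's an assumption for illustration, not the exact setup that burned us.

```yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: hello-world
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: hello-world      # hypothetical deployment name
  minReplicas: 1
  maxReplicas: 10          # the cap that was effectively missing
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70
```

Bound maxReplicas (and the node group's max size) and the worst case is a degraded load test, not a Honda Civic.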
But going back to this: hey, why did it cost so much? Well, sure, doesn't Kubernetes scale? Oh, absolutely. But take a look at what's actually needed for Kubernetes to run, going left to right. If you're unfamiliar with Kubernetes architecture, there's a controller-to-worker-node relationship. For example, you might have two masters, one hot and one there for disaster recovery, and then you have N worker nodes. But each one of those little icons is a piece of infrastructure. Every time there's a node, there's an EC2 instance; those cheese-looking things stacked together, that's the icon for Amazon EC2. And every time you spin something up, you're paying for it. Not only that: go back to this number here; it kept spinning up and up. And there are more dimensions to it. Not only are you paying for the underlying hardware, the underlying compute (there's no such thing as a free lunch, or free beer, whichever way you want to swing it in the free-software world), you're paying for the control plane. So you're paying for the expertise the cloud provider has baked in to spin that stuff up. You're paying for every piece of underlying storage and compute; not only the compute, but there has to be disk somewhere. This stuff doesn't run in the ether, so you're paying for storage. Also, what gets fairly expensive is logging. We could go on and on about how to optimize for logs, but logging incurs cost: it incurs storage, and it incurs usage. If you're using AWS, you're paying CloudWatch costs per number of writes, or per tens of thousands of writes. You're paying for networking: the I/O between the Kubernetes nodes, and my NAT gateway, which I literally still don't know why I was billed 160 hours for. You can see I'm a little salty about it. You're paying for data transfer. And then it just piles on: you have other services. You need to do a build and deploy the application? You're paying for that. You need your source code stored there, which kicks off the build, which deploys? You're paying for all of that. And this is where the billing complexity starts to kick in.

So, what are some of the challenges of Kubernetes? This is where we start getting into some of the optimizations on your Kubernetes workloads and your non-Kubernetes workloads. Kubernetes is still a piece of infrastructure; it still has to be maintained. Going back five or six years, when my team was trying to leverage Kubernetes for the first time, there was this air of: hey, you know what, let's put it on the cluster, it'll scale. But as we know, there are theoretical limits, actually physical limits, to that. It's not infinite, like the public cloud supposedly is; your cluster resources will be exhausted. Also, operationalizing it is very difficult. It takes several people to do it well; it takes a platform engineer. The project, up until very recently, was moving extremely fast (they've gotten better and more mature about slowing the release cadence down), but making sure you're on a supported version, maintaining the platform, patching it constantly: it was a challenge. And the maturity curve is still building. It's still a young technology that people are starting to adopt.
It's not like using a single Linux instance, where there are people with 20 or 30 years of battle-hardened skills; the project only dates from around 2015, so there's still a lot of maturity being built up. One question I recently tried to tackle: hey, how much overhead does Kubernetes take up? In this example, I'm going one step in. The machine size I typically use for my Kubernetes workloads is four CPUs and 16 gigs of memory, and the typical resource size I go and request is, oh, I have these eight-gigabyte pods. Some very interesting things were happening here for me: well, my box has 16 gigs of memory, so I should be able to place two of those eight-gig pods on it, right? Some of the more experienced people here are rolling their eyes: of course not, there's overhead. Well, I forgot that, so I was only placing one at a time. Simply by SSH-ing into a worker node and running free -m, you can see that Kubernetes is taking up about 350 megabytes of overhead. If you run top, you can see it's taking up about 3% while idle. So there's overhead there too. Your operating system clearly takes overhead when it starts up; so does your container orchestration platform. And so with that: Ravi, can you explain what you can track in Kubernetes and what you should be trying?

Thanks, Ravi. And just to be clear, for that hello world example: it wasn't me, though. So, one of the challenges with Kubernetes is that everything gets amplified; a problem can scale and get out of hand much faster. It's like a powerful force multiplier, and it's critical to make sure the multiplication is positive. From a cost-management perspective, it's a good idea to monitor Kubernetes events to closely track things like changes in replica count, whether running containers are whitelisted or not, how many pods and nodes are running at any given time, what the utilized and idle resources are within a pod, and, at a higher level for the node capacity, what the unallocated resources are that aren't claimed by any pods. And finally, of course, any anomaly in cost. This can be up or down: it's obvious to track cost spikes, but cost crashes can be equally important to monitor.

So what else can you do? Another thing that can be really impactful is to orchestrate the pods and nodes both vertically and horizontally based on various usage metrics. Scaling the count of pods on a node, and the count of nodes themselves, up or down as required is an effective way to save on resources. Right-sizing the node so it isn't too big or too small balances cost and performance, and that's another helpful correction to make. Similarly for pods: making sure the right request value is set, based on historical resource utilization and current usage patterns, is an important balance to strike. Now, running nodes on the cloud's excess capacity is another great way to save significantly on the compute costs of those machines; that's spot instances on AWS and Azure, and they're called preemptible VMs on GCP.
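To make that concrete, here's a minimal sketch of opting a node group into spot capacity, again assuming eksctl on AWS. The instance types are illustrative; spreading across several types is what keeps interruptions manageable.

```yaml
# Hypothetical eksctl config with a managed node group on spot capacity
apiVersion: eksctl.io/v1alpha5
kind: ClusterConfig
metadata:
  name: demo-cluster
  region: us-east-1
managedNodeGroups:
  - name: spot-workers
    instanceTypes: ["m5.xlarge", "m5a.xlarge", "m4.xlarge"]  # diversify pools
    spot: true           # request spot capacity instead of on-demand
    minSize: 2
    maxSize: 10
```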
So think about this: these are spot instances with the very same performance, and they come 70 to 90% cheaper than on-demand machines. The primary challenge, of course, is the lack of availability guarantees, meaning the cloud provider can take away that instance at any time. But if you have mechanisms in place to handle spot interruptions seamlessly enough, the cost savings can be huge. And finally, we have forecasting of spend based on historical usage. This is always important from a cost-governance standpoint, to make sure we're not exceeding what's been budgeted. So I know there's a lot of work packed into three little bullet points, but the payoff can be well worth the effort here.

Yeah, so let's look at some of the common challenges around cloud cost management overall, even outside of the Kubernetes world. Maybe not for other things in life, but certainly for your cloud bills, less is better. So what are some of the common challenges we may come across? Firstly, vendor lock-in with cloud providers can prove very costly at scale. By design, it's extremely easy to provision and migrate resources into a cloud provider, but complicated and often prohibitively expensive to migrate out. Vendor lock-in is when you're essentially forced to continue using a cloud provider because switching away is just not practical. So it's a good idea to consider a multi-cloud strategy, to make your apps portable, and so on. Next, you may have over-provisioned or under-provisioned resources that are either costing you more than they should or not giving you the performance you need. Then we have idle and orphaned resources, which could be adding to your monthly cost when really they're candidates for termination. For example, in AWS, this could be EBS volumes that aren't attached, snapshots that are old and unused, load balancers or Fargate clusters that nobody's using, and so on. In fact, we just witnessed an example recently where there was a non-production AWS account with a ton of resources: load balancers, EKS, Fargate clusters, et cetera. These were provisioned, and no one knew when or why. A simple cleanup ended up bringing down the bill for that entire account by 40%. Can you believe that? That might be a high number, but this is surely a simple and worthwhile exercise for any savings number greater than zero. Next up: given the huge number of options, making the right choices can also be a challenge. This could be picking between reserved instances, savings plans, or spot instances for the VMs you're running. It could be choosing the right tier from the multiple available options for S3 buckets, EBS volumes, et cetera. Then you have the somewhat extreme complexity in cloud provider billing that Ravi just touched upon. You have multiple services from each and every cloud provider (AWS alone has over 200), and many of them have their own pricing models. It's a lot to keep track of, and as we know, complexity leads to inefficiency. And lastly, we have the challenges around accurately forecasting spend based on historical usage patterns and current usage, and correlating that to our defined budgets.

So, given all that complexity and all that spend, what are some cloud cost management patterns?
Also, as engineers, we're natural optimizers, right? So it's not only that you're saving money; potentially you get more density too. It might not boil down to, you know what, I'm saving this many instances, but maybe I'm able to bin-pack a little bit more. But Ravi, why don't you take it away on some common patterns to fight the ever-creeping cost and mal-utilization?

Absolutely. So we can think of cloud cost management under three interconnected pillars, so to speak. First we have cost visibility, or cost transparency; then we have cost optimization; and then we have cost governance. As they say, you can't improve something that you aren't measuring. So it really starts with overall, accurate visibility: what services and resources are being used, who's using them, and what are they using them for? Are there any resources that aren't attributed at all? Are there any that are idle or unused, and so on? Also, you want cost visibility into the application services and environments that are provisioned through the CI/CD pipelines being used. Then, on the optimization pillar, we have right-sizing, which is basically making sure resources are only as large as they need to be. We have committed-use discounts (again, RIs and savings plans), evaluating spot instances for high-availability clusters and fault-tolerant, stateless workloads, and then elasticity, in terms of scaling the count of resources up or down based on various usage metrics. Then there's the cost inventory management side, or asset management, as they call it: for EC2, or VMs in general across cloud providers, and for S3 buckets, EBS volumes and snapshots, Elastic IP addresses, Redshift clusters, making sure there aren't any unallocated assets among all of these. And then business mapping of these resources across the organizational hierarchy, which could be for the entire company, for business units, teams, applications, and even team members. Finally, the last pillar is cost governance: essentially setting periodic budgets, whether monthly, quarterly, or yearly, and having accurate forecasts of spend against the set budgets.

Yeah, so we did talk about cloud excess capacity, spot instances, a little earlier, but it may be useful context to spend a minute on where exactly this excess capacity comes from. The fundamental promise of public clouds is: when we need more resources to service our usage, we will be provided said resources. In order to fulfill that promise, cloud providers need to maintain excess capacity, and until this excess capacity is requested, it's idle and unmonetized. So to monetize this otherwise idle excess capacity, cloud providers give us the ability to spin up spot instances at up to 90% cheaper rates, with the caveat, of course, that they reserve the right to take that machine away from you on short notice, usually under two minutes. So because you're getting the same performance with these spot instances at deep discounts, if they're a fit for your workload (meaning you have a high-availability cluster and the workloads are fault-tolerant, stateless, and so on), and if you have a strategy in place to gracefully handle interruptions, then the cost-saving benefits really are incredible.

So here's a popular tweet from a few years ago, which reads: AWS isn't about paying for what you use, but paying for what you forgot to turn off. And it's true for any public cloud, really.
And this is one of the biggest challenges for non-production resources, and there's no surprise why it resonated with so many people out there. So let's look at this in some more detail. We know this is a problem, but how big is the problem? We're talking about non-production resources here, which could be QA, staging, development, demo, R&D machines: essentially everything that doesn't service live traffic. Unlike production environments, these are used by developer teams for maybe four, five, six hours in a given work day. So you have many idle windows even during work hours, and of course during non-working hours, weekends, and company holidays, these environments are completely unutilized. If you compare the four to six hours of actual usage against the full 720 hours in a month, that's 70 to 75% of the month these resources are actually idle, but you're still being charged by the cloud provider.

So how about using a static resource scheduler that forcefully shuts down these environments after working hours and then brings them back up again at a fixed time every morning? Firstly, there's no way to statically predict the idle times that occur within working hours, when your developer teams are in meetings, or on lunch breaks, or working on other things. And then, let's say the scheduler forcefully shuts everything down at 8pm: now there's no way to access these stopped machines, even if you needed to. But using native cloud provider offerings, such as load balancers and CloudWatch metrics, to detect real-time traffic and usage, and performing shutdown or terminate actions automatically when resources are idle: that's a great way to avoid the potentially massive waste here. Just imagine, for all the resources in use by all of us in this room today: while we're here together in this webinar, if there were a way for all of our environments to get shut down and only be brought back up again when we needed them next, I mean, that's almost like magic. And if you find a way to also run these resources on spot instances, now you're a pro.

That's sort of funny. On a personal note, half of my career has been on elastic infrastructure and half has been on infrastructure that's not elastic, and what Ravi said about turning things back on being the hard part is true. I used to work for an investment bank (we ran WebSphere), and we would constantly bicker over, hey, do you need that node? Do you need that underlying VM powering your application? So I used to write scripts that would go and touch folders in each of my non-production environments throughout the day, so the bank's monitoring would say, oh yeah, someone's accessing it again. Because turning it off wasn't the hard part; it was re-spinning it back up. And even with all the cloud-native stuff, the elastic infrastructure, even leveraging an orchestrator like Mesos or Kubernetes itself, I still haven't shaken that. I still worry: if it goes away, it's not going to be there. But Ravi, this is so true. Turning it back on is super hard, man.
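For flavor, here's roughly what the native CloudWatch approach Ravi described can look like: a minimal CloudFormation sketch, assuming average CPU is a fair idle proxy for the box. The instance ID is hypothetical, and real setups often key off load balancer traffic as well.

```yaml
# Hypothetical CloudFormation: stop a dev box after two consecutive idle hours
Resources:
  IdleStopAlarm:
    Type: AWS::CloudWatch::Alarm
    Properties:
      AlarmDescription: Stop the instance when CPU stays under 5% for 2 hours
      Namespace: AWS/EC2
      MetricName: CPUUtilization
      Dimensions:
        - Name: InstanceId
          Value: i-0123456789abcdef0        # hypothetical instance
      Statistic: Average
      Period: 3600                          # one-hour windows
      EvaluationPeriods: 2
      Threshold: 5
      ComparisonOperator: LessThanThreshold
      AlarmActions:
        - !Sub arn:aws:automate:${AWS::Region}:ec2:stop   # built-in EC2 stop action
```

The hard part Ravi called out, bringing things back up on demand, is what the fancier tooling layers on top of this.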
So with that, oh, actually it's my turn to talk now. With all of the wisdom that Ravi has bestowed upon us, there's also this concept called FinOps. And so, what is FinOps? It sounds a lot like DevOps, or one of those somethingOps monikers, DevSecOps, FinOps, but what actually is FinOps? There's a movement behind how you should go about optimizing cloud spend. But similar to any paradigm shift, it's more than just a set of practices. It's a culture. It's a governance structure. It's a team. It's similar to DevOps: if you ask any DevOps pundit, can you hire one DevOps engineer and then you have DevOps? No. DevOps is a culture; you don't just get to check a box that you have DevOps, and you don't check a box that you have FinOps. It really takes multiple stakeholders. And my definition (I got a little ahead of myself) is that FinOps is really the DevOps of finance, which is interesting. It brings together multiple stakeholders, from the financial teams to the operations teams to the engineering and development teams. This particular pinwheel, or lifecycle, really looks a lot like agile. Quoting Ravi again, you can't optimize what you can't measure, so you make sure you're capturing the right metrics to inform you. And like I said before, as engineers, we're natural optimizers; we wouldn't be doing what we do if we didn't like optimizing things. So you make informed decisions on the data, metrics, and usage that you have. Then you feed that back in: okay, we can fine-tune it. Here's what I learned: hey, workloads aren't being placed on these particular nodes because there's overhead. If I want a density of two pods per worker node, I need to change something: scale back the resource requests of the pods, or get a bigger box. There's that push and pull. Then you make sure you're able to implement the change, and finally that the adjustments you made are actually prudent.

And if you want to learn more about FinOps, you can head to finops.org. Harness is a member firm of the FinOps Foundation (we're also a member firm of the Linux Foundation), and it's a sub-foundation of the Linux Foundation. There are lots of resources there, no matter where you sit, from systems engineer to financial analyst and anybody in between, above, or below. There's lots to learn at the FinOps Foundation. A funny personal note: taking my career back years and years, I used to butt heads with the systems engineers; there are probably a lot of them on this call. As an application engineer, an application developer, my greatest nemesis was the operations team. You know, I used to drop iptables rules all the time (I'm sure some people are groaning, like, don't do that), but I didn't know how my application communicated. Sad story. But as the years went on, DevOps brought us closer together; now we're practically pals. Currently, or up until recently: Ravi, do you want to guess who my nemesis is as a manager now? It's finance, right? My nemesis switched from the operations team. As a leader in the firm now, I have bills I need to pay and a budget I have to set for the rest of my team: hey, I have forecasts I need to make, it's going to be X tens of thousands of dollars. My nemesis is finance. But with FinOps, the same silos that were brought down with DevOps are coming down between finance and the rest of the organization.
So with that, you know, I think this is the end of the speaking-to-folks part of the presentation, and we'd love to get questions. If you want a copy of the slides, or just want to learn more about how we interact with FinOps and the stuff we're doing at Harness, give this bit here a scan. It'll take you to a site where you can grab a copy of the slides or sign up for some of the things we have at Harness. I'll keep this up if anybody wants to take a look, and we can answer the questions that came across. And for the audience, feel free to ask any questions you want in the Q&A section. We'd love to hear from you, love to just chat. Ravi and I are here to help answer any sort of question. It doesn't have to be technology related; it could be about the meaning of life. I think 42 is what the computer came back with.

Ah, okay. So here's a question: how do you start your career in Kubernetes, or take your career towards Kubernetes? I can answer that one, Ravi, if you wouldn't mind; I'm just going to go for it. There are lots of ways. Like with any technology, there are a lot of resources, and you're at the right spot: the Linux Foundation is the custodian of many of these projects. If you've never used Kubernetes before (this is kind of off-topic for the webinar, but I had to go through this journey), if you have access to a Windows machine or a Linux machine, there's a project called Minikube; it even installs on Windows with PowerShell access now. So I would take a look at Minikube and then just run through some very quick manifests and applications. At a very simple level, Kubernetes is a declarative system. What that means is that you author these manifests, in YAML, and you say: hey, I want this image to be accessible at this port, and I want it to have this many copies of itself. You can make a very simple deployment, like eight or nine lines of YAML (see the sketch just below), deploy it, and watch the magic in the terminal as it deploys. Then, as you get more comfortable with that, you'll start to figure out that everything in Kubernetes is pluggable. You can change the opinions, and it took me a while to figure that out. Like, hey, I don't like how it does this; swap things out. You can modify the controllers and how they operate, you can make custom resource definitions, you can change opinions, you can influence, for lack of a better word, how stuff gets placed. So yeah, that's definitely a good question.
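Here's roughly what that starter manifest looks like: a minimal sketch (a real Deployment needs a little selector boilerplate around those eight or nine essential lines), using a stock nginx image as the example.

```yaml
# hello.yaml: apply with `kubectl apply -f hello.yaml`
apiVersion: apps/v1
kind: Deployment
metadata:
  name: hello
spec:
  replicas: 2                    # "this many copies of itself"
  selector:
    matchLabels:
      app: hello
  template:
    metadata:
      labels:
        app: hello
    spec:
      containers:
        - name: web
          image: nginx:1.25      # "this image..."
          ports:
            - containerPort: 80  # "...accessible at this port"
```

Then watch the magic in the terminal with `kubectl get pods -w`.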
Okay, so another question: how do you merge a finance team and an operations team, since they're definitely different business units? Ravi, you have a little bit more experience here; you've seen bigger bills than I have sometimes, and you have to talk to the higher-ups. What do you say to that question?

Yeah, absolutely. So, typically, just like with most relatively new functions: this is a function of its own. While there are aspects of both in it, a finance aspect and an operational aspect, fundamentally this is a separate role and capability that organizations have now started building, for all of the reasons that Ravi mentioned. So it's typically a skill, and a department, of its own.

Cool. Okay, so a couple more questions about starting out with Kubernetes. Another person mentioned that there are several other packages; yeah, like you said, there's K3s and kind; thanks, Chris, for that. So there are multiple local ways to start a cluster, and also, if you want, you can use one of the public cloud vendors; there are usually credits for first-time users if you want to spin up an AKS, GKE, or EKS instance. And there are two sides here; this is going back to that earlier question. There are the folks authoring the workload: say you're writing a Java application; it needs to be Dockerized to run on Kubernetes. Usually the application engineer will create at least the code, and either a build engineer or the application engineers themselves can make a Docker image; when it runs in Kubernetes, it's a container. But you can also get a lot of pre-baked images. NGINX is the quintessential one: deploy library/nginx. I think they hit the rate limit on Docker Hub because so many people use that as their first one. So no, you don't necessarily have to be an application engineer; you're just authoring those eight or nine lines of YAML manifest.

So, for the other question: are there any other ways to lower cost other than spot instances? We'll give it to Ravi again, the cloud guru here. Yeah, absolutely. Spot instances are a really good way, but they are something specific. Fundamentally, in terms of lowering costs, there are a couple of ways to do it. One is making sure the elasticity is right: are you running the number of instances, pods, nodes, et cetera, that you actually require, and no more? The other is to also look at scaling vertically. If it's Kubernetes, do you have the requests and limits set to exactly what you need, so there's no idle or unutilized capacity there? If it's a VM (an EC2 instance, in the AWS example), are those instances right-sized? In the sense of: are they too large, so you're spending more than you should be, or too small, so you're not getting enough performance, and so on? So there are a bunch of different ways, but on top of all of those, definitely, spot instances where the workloads fit is something that's useful to look at as well.
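As a concrete illustration of that vertical knob, here's a minimal sketch of a right-sized pod. The numbers are illustrative; in practice they'd come from your historical utilization data.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: right-sized
spec:
  containers:
    - name: app
      image: nginx:1.25
      resources:
        requests:        # what the scheduler reserves for this pod
          cpu: 500m
          memory: 1Gi
        limits:          # the hard ceiling before throttling or OOM-kill
          cpu: "1"
          memory: 2Gi
```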
Awesome, I really want to take this one; I'll steal it. So one question here is: what is the typical overhead time for dynamically scaling your nodes up and down, and is it acceptable for customers running real-time workloads? Oh, it depends. Everyone's favorite phrase in the IT world: it depends. Because there are two parts to it. As a distributed systems engineer myself: there are two things that are scaling up. There's the scaling up of the infrastructure being ready to even process a workload, and then there's the time it takes for the workload itself to start, this concept of cold starts, which I'm going to get into.

So let's say you spun up a new (take Kubernetes out of the picture) EC2 node in your ensemble of workloads. You might be able to get a Linux instance up and running in EC2 in two minutes, from the time you said, hey, I need something, to the time the health check passes and it's able to receive traffic. But then you have a blank Linux instance. You might have an application, say a Java application; well, you need to get that infrastructure onto the machine. There are ways around that: you might have an AMI, an image it boots from. But Java itself has cold start times; the language runtime supporting it has a cold start time. If you have a database and you add another node, there are cold start times before it can process transactions. So it really depends on the workload itself. Now, Kubernetes did make this a little bit faster: I can spin up a pod pretty quickly, the daemon spins something up really quickly, but it still comes down to when the final health check passes. You might be able to place a new pod in seconds, but the final health check confirming it's available for traffic, that it's able to send and receive, could take 60 seconds, 80 seconds, 90 seconds, a minute, two minutes, depending on the workload. And going back to the last part of the question: is it acceptable for the customer? As Ravi mentioned, you have to architect around it, so you do need some excess capacity, or buffer. It's good distributed-systems principle: there's something called the fallacies of distributed computing, and latency, overhead, and administration costs aren't erased by any system; they just get redistributed. So thanks for that question.

This might be a good one for you, Ravi: typically, what percentage of nodes are spot instances versus on-demand instances? You might have the same answer: it depends. Yeah, well, we can be slightly more specific here. Typically, if you're using the native capabilities of a cloud provider (let's take AWS, for example, with the mixed instances policy), what's usually suggested is that you only have up to around 30% of your workloads on spot instances, just to service spikes, increased usage, and so on. But having said that, if you have spot interruption handling in place, if you use a spot operator, whether you've built something in-house or you're using a service like Harness to orchestrate the spot instances, you can actually go ahead and run 100% of your nodes on spot. Because what happens there is, when there's a spot interruption, an alternate spot instance is provisioned for you in its place. There's a fallback to on-demand that happens automatically when spot capacity isn't in the market at all. And then the spot market is continuously polled, and when spot capacity is available again, there's a reverse fallback from on-demand back to spot. So because of all that orchestration, you can actually run 100% on spot in this particular case.
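For reference, here's roughly what that native mixed-instances setup looks like: a minimal CloudFormation sketch. The launch template ID and subnets are hypothetical, and the 30% on-demand split mirrors the conservative default Ravi mentioned.

```yaml
# Hypothetical Auto Scaling group mixing spot and on-demand capacity
Resources:
  SpotHeavyASG:
    Type: AWS::AutoScaling::AutoScalingGroup
    Properties:
      MinSize: "2"
      MaxSize: "10"
      VPCZoneIdentifier:
        - subnet-0aaa1111bbb2222cc                # hypothetical subnets
        - subnet-0ddd3333eee4444ff
      MixedInstancesPolicy:
        InstancesDistribution:
          OnDemandBaseCapacity: 0
          OnDemandPercentageAboveBaseCapacity: 30  # ~70% of capacity on spot
          SpotAllocationStrategy: capacity-optimized
        LaunchTemplate:
          LaunchTemplateSpecification:
            LaunchTemplateId: lt-0aabbccddeeff0011  # hypothetical template
            Version: "1"
          Overrides:                               # diversify across pools
            - InstanceType: m5.xlarge
            - InstanceType: m5a.xlarge
            - InstanceType: m4.xlarge
```

Dropping OnDemandPercentageAboveBaseCapacity to 0 is the "100% spot" posture, which is only sane with the interruption handling Ravi described.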
Cool, hey, I like it. That's aggressive: 100%, Ravi. I'm excited. I'll take this one. So, a follow-up to the previous question I answered: are there strategies for avoiding cold starts? Well, there are lots of strategies. If you have a stateful workload you're provisioning, there's lots of discussion of, say, how much a net-new user adds to the amount of infrastructure you need, the number of concurrent users you support when doing capacity planning. So what's giving you the capacity? Is it adding another node to the application? Are you limiting the number of people who come in? Are you reprioritizing people? There are dozens of ways you can take this. But there are certain cold starts you can avoid. For example, I think the cold start the question is asking about is when you have an in-memory cache and the next node has to come online, and now it needs to replicate all the key-value pairs it didn't have, or that it lost in a failure, when it comes back. There are certain ways around that; in very highly tuned systems, you can ship a block of them at one time. But there are other things you just can't avoid, like Java cold starts. I'll get a little soapboxy here: serverless, functions, is kind of all the rage now, and there are still very poor cold start times in certain languages. Using Node versus Java, Node will trounce Java in terms of what it takes to start, so if you're serving requests that need to be sub-second, you might want to look at a different language stack. So really, diagnose the patient. If you're looking at something like an in-memory solution, each provider certainly has ways to get around it, but it's still something to take into consideration.

I think that was it for the questions. We still have a few minutes if anybody wants to ask anything last-minute. On behalf of Ravi and me: if you want to chat with us on Twitter, there are two Ravis on there you can add who would love to talk to you. But if that's it: yeah, I don't see any other questions. So thank you again to Ravi and Ravi for their time today, and thank you to all the participants who joined us. As a reminder, this recording will be on the Linux Foundation YouTube page later today, and we hope you're able to join us for future webinars. Have a wonderful day. Thanks, everyone. Cheers. Thank you.