Hi, welcome to today's webinar. I'm Steven, a Solutions Architect at Sosivio. Today we'll discuss resource allocation in Kubernetes, some of the challenges that come with it, and how Sosivio is tackling these problems. Please place your questions in the comment section below and we'll answer them as soon as possible.

We will start with a brief introduction to Sosivio. We will then discuss the biggest challenges of Kubernetes resource allocation and how Sosivio solves them with our application profiling module. We will dive into Sosivio's application profiling and the benefits of the novel methodology, called Data Swirling, that fuels our profiling recommendations. Finally, we will demonstrate a few examples of application profiling in action.

First, let me briefly introduce Sosivio. While we are focusing today's discussion on automated application profiling, the Sosivio platform also performs predictive troubleshooting for Kubernetes applications and environments. This means we are able to constantly observe cloud-native environments, detect when certain behaviors will lead to critical failures, determine the root cause of those issues, and alert users of failure events as they unfold in real time. By taking action early, we prevent small issues from becoming catastrophes, which saves companies from feeling the business impact of an outage. Sosivio plays along with the nautical theme of Kubernetes and literally means "life jacket" in Greek. Yes, Sosivio is your life jacket on your Kubernetes voyage.

Everyone is excited about Kubernetes. How could you not be? Kubernetes adds vital automation that greatly assists in developing microservices at scale. Developers love the faster, easier, and more flexible ways to build and deploy applications. Increased development speed means new product features are released and updated much faster, which is exciting for companies as a whole. Operations teams love the ease of scaling and the self-healing. This translates to an improved user experience, increased revenue, a shortened time to market, and much more. Enterprises see Kubernetes as a key to unlocking lower-cost applications by utilizing several small containers instead of an entire VM or physical server.

While the benefits are undeniable, the hidden secret is that Kubernetes is complex. Kubernetes is dynamic. Workloads are ephemeral, but they're not always treated as such. There are several layers underneath the Kubernetes abstraction layer, each with a lot going on, and when there is an issue, it's much more difficult to find than when dealing with a single layer. If you're only running a few small applications, you may not feel the complexity. You definitely will feel it once you start to really use Kubernetes at scale.

Most of us experience at least one of the following: first, users being affected by an issue with dynamic symptoms that make it incredibly difficult to determine a root cause, because of how hard it can be to see what's actually going on in the cluster; second, receiving a massive bill from a cloud service provider. While both are important, we're going to focus on the latter today. Most companies are quick to adopt and use Kubernetes, but few give enough attention to the cost optimizations that accompany proper cloud-native application design. Developers and DevOps teams are able to stand up pods and clusters, and they can build, test, and run their first applications.
For DevOps teams, this early momentum breeds excitement and supports moving more applications over to Kubernetes. At some point, that momentum slows or even stops abruptly. But why? Teams start to experience Kubernetes complexity, the business impact of unresolved issues, and their effects on neighboring applications within the cluster. Overallocated resources, for example, will cause other applications to throttle intermittently or even crash. As companies increase adoption, these issues compound exponentially and result in both significantly more complexity and a significantly higher cloud bill than anticipated.

Manually trying to tune and allocate resources is difficult, and impossible to do at scale. The safe and easy option is to just add a few more resources than you think you will need. For a single application, that's really not an issue, but multiply this waste across several applications, or in most cases thousands, and it becomes clear it's causing significant waste, and it's definitely not scalable. The shock of a massive cloud bill prevents companies from continuing to adopt at a rapid pace and forces them to take a step back in their migration process. This delay prevents companies from achieving their technology goals and ultimately affects their customers. It's like being thrown off a ship into the ocean and being told, okay, now learn how to swim. While you're treading water with your head sinking lower and lower, Sosivio is the life jacket that lifts you back up, brings you back to the ship, and allows you to continue your Kubernetes journey.

As companies migrate to cloud-native applications and adopt either a cloud or hybrid-cloud approach, they need to start thinking about capacity. While cloud native brings a promise of smaller, more agile applications, it inherently removes the physical limitation on maximum capacity. If you're given a thousand nodes with a set of hardware, you know your maximum capacity, and it's impossible to exceed it without requesting and installing new hardware. This is a natural ceiling on capacity. As much of a headache as it might be to go through the process of buying more servers, it is a natural barrier that keeps you from exceeding your IT infrastructure spend. Cloud service providers make this barrier much less significant and much easier to exceed, and they are incentivized to do so. They surely have more than enough capacity for companies to grow into, and letting your IT infrastructure grow is in their best interest as a business. The cloud enables a lot of innovation and agility, but it also opens the door to much more waste. Having more space to use while not being thoughtful about resources is a double-edged sword: the more space that is used, the higher the amount of wasted resources is likely to be. A recent study by Datadog found that within the past two years, the average number of pods per organization has doubled, with a similar relative increase in the average number of Kubernetes hosts.

Overall capacity is important because we want to keep costs from getting out of control, but capacity matters on a per-node level as well. We care about per-node capacity for performance reasons. Cost at a per-node level isn't really a problem, but performance can be destroyed if we're not careful. One container's wasted or unrestricted resources affect every other container running on the same node. You may have heard the term noisy neighbor before, which refers to this issue.
It's actually more analogous to having a hostile neighbor who is wreaking havoc across the entire neighborhood. For example, if one container is hoarding a large portion of resources and every other container relies on what's left, those containers could be throttled, killed, or never deployed in the first place. This translates to a poor customer experience and lost revenue.

Okay, so we know resource allocation can present several problems for us, but why is resource allocation for Kubernetes difficult? After all, there are plenty of ways to adjust resources in Kubernetes. We can adjust CPU, memory, the number of replicas, and a multitude of other levers. The problem lies in the vast variety and combination of controls. The variability grows with each application deployed on your cluster, and the velocity at which applications are developed and deployed makes proper resource allocation difficult to manage. What's even more daunting is that each and every change to resource allocation affects both cost and performance, for the application you're tuning and for the other applications deployed on the cluster. Imagine every application in the cluster has a piece of rope tying it to the other applications on the cluster. A pull on one rope has second- and third-order effects across the entire cluster. Adjusting one pod's resources affects the other pods that interact with it. If your back-end applications are underallocated and they cause timeouts with the front end, then you get user-facing issues. This is only a very basic example, and it gets exacerbated in highly complex environments at scale. Watching, analyzing, and adjusting resources for your cluster could easily be a full-time job for someone, even with a small cluster. It's impossible to do this manually for most companies. Right now, most developers are simply guessing, or overallocating to ensure they have enough resources.

This is where Sosivio's application profiling really shines. Application profiling enables users to get live feedback and recommendations for resource allocation to fully optimize both cost and performance of applications across your entire environment with just a click of a button. This enables developers and operations teams to maintain the velocity at which they intend to deploy new applications. If you have not spent much time thinking about resource allocation, think again. It's a serious problem. Gartner even reported that cloud overspend for last year was around $17.6 billion.

If resource allocation is such an issue, what are companies currently doing? Let's jump into how companies solve this issue today by manually profiling each application. Profiling is the process of gathering information about a program's behavior as it executes. You profile an application to determine which areas of a program can be optimized to increase overall performance, reduce resource usage, and ensure stability. Application profiling tools help to simplify the process. We have discussed the financial and performance risks you take if you don't properly profile your applications. For example, if you underallocate resources, your pods can be OOM killed or throttled. If you overallocate resources, you waste resources and can cause neighboring pods to throttle. If you elect not to specify any requests or limits at all, then you could have pods being OOM killed or throttled. But why do these events happen? First, let's define what CPU and memory requests and limits are.
A request tells Kubernetes what my resource consumption, for CPU and memory, is under normal application behavior. For example, let's say we wrote an application that can digest 100,000 requests a second while consuming 512 megabytes of memory. So I tell Kubernetes that the application it's about to deploy is going to require at least 512 megabytes of memory. That's the request. This is how Kubernetes knows to take the application and put it on a node that has at least 512 megabytes of memory free. If we don't set a request, then Kubernetes will just put it anywhere and hope for the best. In the case where the pod gets assigned to a node with only 100 megabytes of memory available, the pod will crash if it consumes more than 100 megabytes. It would reach that 100-megabyte ceiling, try to consume a little bit more, and then get OOM killed. It's likely to restart again and again and again. The same applies to CPU requests, but because CPU is a compressible resource, the negative effects are throttling and poor application performance rather than a crash.

The second resource allocation parameter we're going to talk about is the limit. The limit instructs Kubernetes to cap the resources for that process, and it's enforced via the control group in which the process runs on the node where it's deployed. The application cannot cross that limit without being killed or throttled, for memory and CPU respectively. The reason to do this is to protect all of the other applications on the cluster itself. Clearly, we need to be thoughtful in setting resource requests and limits for each application we deploy.

Let's take a look at how we would manually profile an application. First, we collect all of the raw data we need to make an informed decision. Second, we measure the data over a set time interval to determine how the application behaves. Finally, we adjust the code and/or resources to meet the application's needs. Raw data collection can be done with a variety of scripts or tools. This step is vital to get right, because if we don't collect the right amount of data from the real environment, or that data is not granular enough, then we are wasting our time with the rest of the process, since we're working from bad and inaccurate data. For example, using Prometheus and Grafana is not accurate enough, because they were not designed for 100% accuracy. If we were using this data for profiling, we would be missing vital pieces of information needed to properly optimize our resources.

Let's take a look at a real example of how Sosivio collects and displays information versus Prometheus and Grafana. On the bottom of the screen, we have a Grafana dashboard displaying CPU consumption of Sosivio's machine learning microservice that discovers new failures and their associated root causes in Kubernetes environments. On the top of the screen, we have the exact same pod displayed over the exact same time interval, only in Sosivio's live metrics dashboard. Clearly, we can see the level of detail that Sosivio provides over the most commonly used tool today. If we were to profile CPU based on the Grafana dashboard, we would say, my CPU never exceeded half a core, and we would set our CPU request and limit accordingly. In reality, the CPU is much more active, reaching over half a core 15 times and spiking up to 1.4 cores.
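To make this concrete, here is a minimal sketch of where requests and limits live in a pod spec. The name and numbers are illustrative, pulled from the examples we just walked through: the 512-megabyte memory request, and the half-core CPU limit that the coarse Grafana view would have led us to set.

```yaml
# Illustrative pod spec -- names and values are examples from this discussion,
# not recommendations. The request tells the scheduler what the container
# needs under normal load; the limit is the cgroup-enforced ceiling.
apiVersion: v1
kind: Pod
metadata:
  name: example-app        # hypothetical workload
spec:
  containers:
    - name: example-app
      image: example-app:latest
      resources:
        requests:
          memory: "512Mi"  # the 512 MB the app needs for normal behavior
          cpu: "500m"      # half a core, per the (too coarse) Grafana view
        limits:
          memory: "1Gi"    # exceeding this gets the container OOM killed
          cpu: "500m"      # exceeding this gets the container throttled --
                           # too low for a workload that truly spikes to 1.4 cores
```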
Profiling this application at 0.5 cores would cause consistent throttling, very bad application performance, timeouts, and a variety of other issues that can spread to every other pod it interacts with. Sosivio's data is much different from what Prometheus is displaying. We can easily see that data granularity greatly affects how accurately we can allocate resources to this application. Imagine how this discrepancy in data accuracy affects every other application monitored by your Prometheus stack. This single example becomes a massive issue at scale.

Using inaccurate data clearly leads to issues that are exacerbated at scale. So how do we start with accurate and granular data? How do we work with the data collectors out there today if they're not usable? Well, we don't. Sosivio recognized this issue, and we opted to build our own high-performance data collectors that overcome the challenges of open-source data collectors. Sosivio's data collectors are incredibly resource friendly and completely optimized for Kubernetes. They collect metrics at a much more granular level than any other open-source tool in existence today. We speak about this in our other webinar on Data Swirling, which covers Sosivio's novel approach to both gathering granular metrics and preemptively detecting Kubernetes issues. With custom data collectors utilizing Data Swirling, Sosivio collects and analyzes massive amounts of data at lightning speed.

Let's briefly discuss how Data Swirling works, as it's a key component of how Sosivio is able to make use of massive amounts of data in real time. At a high level, we collect everything from the kernel, OS, network, processes, applications, container runtime, and Kubernetes API. We determine which pieces of data are relevant by immediately evaluating every data point we collect. We only analyze the currently relevant data within our set of machine learning microservices. Data is processed and analyzed 100% in memory and immediately swirled to the next machine learning microservice for processing. In parallel, live metrics are displayed in our dashboard, application resources are profiled, and optimization recommendations are provided. This is different from every other tool today, because they all store collected data on disk in a database, then process that data, and then send back results after an incident has happened, or display metrics that are computed averages and frankly not usable. Also, many tools today are highly intrusive. They require instrumentation, code injection, data offloading, or a variety of other security and stability risks. Another huge difference is that every other tool overwhelms users with data, graphs, and logs. Sifting through an overwhelming amount of information is only a small part of the puzzle, as it still requires a human with expertise to analyze and use that information. Sosivio provides data that is already analyzed, removing the time-consuming process of classifying, correlating, and analyzing information.

Let's talk about a few more approaches to determining resource allocation and why they fall short. I have talked to multiple people who said they're using a vertical pod autoscaler and want to know why it's not the right solution. Well, first and foremost, it competes against horizontal scaling, which is a key principle of cloud-native architecture. The entire purpose of a container is to use as few resources as possible and to be elastic, that is, to scale horizontally.
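If you haven't used it, here is roughly what a VerticalPodAutoscaler object looks like, assuming the upstream autoscaler's custom resource is installed; the minAllowed and maxAllowed boundaries are the ones we'll come back to in a moment, and every value here is illustrative.

```yaml
# Illustrative VerticalPodAutoscaler (requires the VPA CRD and controllers,
# which ship separately from core Kubernetes).
apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: example-app-vpa
spec:
  targetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: example-app      # hypothetical workload
  updatePolicy:
    updateMode: "Auto"     # VPA evicts and resizes pods on its own
  resourcePolicy:
    containerPolicies:
      - containerName: "*"
        minAllowed:        # without sensible boundaries here, the VPA
          cpu: "100m"      # keeps resizing with no regard for the
          memory: "128Mi"  # repercussions on the rest of the cluster
        maxAllowed:
          cpu: "1"
          memory: "1Gi"
```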
By scaling horizontally, you won't run into the limitations of a single machine if the application is architected properly, and you can handle a load that is limited only by your entire infrastructure or Kubernetes cluster. Right-sizing your application is essentially fully automated at that point. The second issue with the vertical pod autoscaler is that if you don't set, or incorrectly set, the maxAllowed for your resources, it will continue to add more resources to your applications without knowing or caring about the repercussions. If enough resources are added, the pod will be OOM killed, and you would need to go through the process of determining the proper requests and limits anyway, on top of recovering from a failed application. The third issue is that you should set boundaries for your resources; these are the minAllowed and maxAllowed in the vertical pod autoscaler. The question is, without accurately profiling your application, how do you arrive at a minAllowed and maxAllowed? You would be relying on bad data in most cases, which again negates any time you think you saved by relying on the vertical pod autoscaler. If those are not good enough reasons, then I'll leave you with this: there is a reason the Kubernetes community rejected the vertical pod autoscaler project, and you use it at your own risk.

The other common methods used today are also not sufficient. Trial and error is okay for a single application or a small, isolated test environment. If you have the time to manually do this for each and every version of an application you deploy, and you're only utilizing Kubernetes for a few small applications, then go for it. That is usually never the case, and one could argue that you should not bother using Kubernetes for a single application or only a handful. And of course, you're not going to use trial and error with massive production deployments, and certainly not when scaling. Trial and error simply does not work at scale; it would require an extensive number of people doing this as a full-time job. Load testing should only be done to determine the minimum viable settings needed to run a base unit of a microservice. That means finding how small we can make a pod while still handling the smallest amount of load our application will receive. The problem is that most companies, even those that claim to be experienced Kubernetes users, load test an application to determine how big they should make it. We talked about manually observing metrics already, but as a recap: it's time consuming, and open-source data collectors don't provide accurate enough data to use. The one thing all these approaches have in common is that even if you do get an answer about how the application is performing, you still need to know what to set your resource requests and limits to.

Sosivio automates and streamlines the entire profiling process. We collect detailed metrics for all of your applications and feed them into our application profiling microservices. These microservices are constantly observing each application's behavior and resource consumption. There are a number of tools today that can assist in resource profiling. However, Sosivio has an advantage in that we leverage our state-of-the-art Data Swirling technology, our real-time, in-memory data collection and analysis architecture. Sosivio can capture incredibly granular resource utilization metrics, giving it a much more accurate picture of what the resource utilization is and what the allocated resources should be.
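Since the recurring theme is right-size the pod, then scale out, it's worth showing the horizontal counterpart. A minimal HorizontalPodAutoscaler, using the standard autoscaling/v2 API, looks roughly like this; the target and thresholds are illustrative, not recommendations.

```yaml
# Illustrative HorizontalPodAutoscaler: adds or removes replicas of a
# right-sized pod based on observed CPU utilization.
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: example-app-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: example-app          # hypothetical workload
  minReplicas: 2
  maxReplicas: 10              # capacity grows by adding pods, not by
                               # inflating one pod's requests and limits
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70   # scale out before pods start throttling
```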
Be aware, if you are new to Kubernetes or have not designed your applications to be cloud native: Sosivio's profiling recommendations are geared towards those who have been working with Kubernetes and have designed their applications with cloud native in mind. Of course, Sosivio will alert you if your applications are not cloud native, and it will still recommend resource allocations based on best practices for cloud-native design. If you do run into this warning, it's a great thing. It means there's room for improvement in your application design, and we have identified it. Acting on it makes your applications significantly more stable, robust, and scalable.

One common scenario is that Sosivio recommends lower resource requests than a developer has allocated, and thus incorrectly believes should be allocated, which scares developers. The application will run and then throttle when it hits a certain load. The proper method of allowing more capacity is to scale horizontally by adding more replicas, as in the autoscaler sketch above. This ensures we're utilizing cloud-native design principles to achieve practically unlimited elasticity for our applications. The problem is that newer or inexperienced developers who apply a lift-and-shift mentality are not prepared to add replicas, but rather scale up. That can lead to lost data and/or poor performance, and keep in mind that this is 100% tied to poor application design. It goes against the entire point of Kubernetes and microservices. Sosivio's application profiling not only gives you the right recommendations, it can also make your developers better prepared for the cloud-native world.

Let's jump into a few live examples of application profiling. Let's take a look at an overallocated application to see how we can quickly free up resources on our cluster and reduce our cloud spend. On the top of the screen, we have Sosivio's live metrics on CPU consumption for a Grafana deployment. Of note, Sosivio's live metrics are always free and are much more granular than what Prometheus provides. On the bottom of the screen, we have the same application during the same time period, only displayed in OpenShift via Prometheus's live metrics on CPU consumption. Looking at Sosivio, we can see that the application's behavior is pretty erratic, with several repetitive spikes in CPU. We will take a look at the memory consumption as well; we can see it's fairly consistent over the same time period. Okay, now normally we would have to start writing down the data points for CPU and memory consumption, determine how the application behaves, and calculate the resources used while balancing that against the application's behavior to determine our resource requests and limits. This can take some time, and this is only for one application.

Let's make this easy and utilize Sosivio. I'll navigate to Sosivio's application profiling module and see what the recommendations are. I'll filter by overallocated applications and click on Grafana. We can see that memory is heavily overallocated, by a factor of about 10. Our maximum memory consumption only peaked at 55 megabytes, but there is one gigabyte allocated as a limit, with our request at 256 megabytes. We were also overallocated on CPU by quite a bit. Sosivio's recommendations reflect the actual consumption of Grafana, and the best part is we didn't have to waste time tracking and analyzing all of this information.
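To give a sense of the change that's about to be applied, here is roughly what the adjustment looks like in the deployment's resources section. The before numbers are the ones we just saw in the dashboard; the after numbers are illustrative stand-ins for Sosivio's actual recommendation, which is computed from the observed consumption.

```yaml
# Before (what we found in the cluster): heavily overallocated.
resources:
  requests:
    memory: "256Mi"   # actual peak consumption was only ~55 MB
  limits:
    memory: "1Gi"     # roughly 10x what the pod ever uses
---
# After (illustrative stand-in for the recommendation): right-sized.
resources:
  requests:
    memory: "64Mi"    # just above the observed ~55 MB peak
  limits:
    memory: "128Mi"   # headroom without hoarding a full gigabyte
```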
We simply click execute recommendation, and Sosivio adjusts the resource requests and limits for us, saving precious resources in our cluster that can be allocated to other applications to improve their performance, or that make space for more applications to be deployed on the same cluster.

Let's take a look at one more application. This time, we will look at a pod that is underallocated and throttling. We'll take a look at the live metrics for the pod to better understand Sosivio's recommendations. We see that the memory consumption is relatively low, so there are no real concerns there. Let's take a look at CPU consumption. We see that CPU consumption is continuously hovering around the limit and is likely throttling. We will take a look at the throttling page, and we can see that it is in fact throttling, as the graph is at 100%. We will navigate over to application profiling and take a look at the recommendations. Of course, the recommendations for CPU resources are higher, and given that the application is constantly throttling, we're going to apply them. We simply click execute recommendations and let Sosivio adjust the resources for our application. As you can see, with a single click of a button, we're able to save cloud spend, free up resources, and increase application performance.

Thank you for attending today's webinar. Are you ready to give Sosivio a try for free? Follow the link on the screen to try Sosivio Premium free for four weeks. There's no need to speak to a sales rep or enter any form of payment. After the four-week period, Sosivio will automatically convert back to Sosivio Community Edition, which is free forever. You have nothing to lose by trying it out. If you have any questions or comments, please feel free to contact me at steven@sosiv.io. Thanks for watching.