Well, hello everybody and welcome again to another OpenShift Commons briefing. This time we have Univa, their CTO Fritz Ferstl, and one of their engineers, Stefan Haas. And we're going to be talking about virtual multi-tenancy and running non-container workloads on OpenShift. I know this works in generic Kubernetes as well, but we're really pleased to have this approach being explained to us. And we're going to learn all about how their offering Navops helps us make this a reality. So without further ado, I'm going to let Fritz start us off and get us introduced to what Univa has been doing. Take it away.

Thank you, Diane. I hope you can hear me. And thanks for having us. So jumping right into the first slide here, just by way of introduction, very briefly, who we are. I'm Fritz Ferstl, the CTO at Univa. I've been around the block for quite some time, pretty much always focused on workload and resource management in distributed environments. More recently, I have been focusing my attention on container orchestration, so mainly Kubernetes, actually, but also a little bit on Swarm, Mesos, and so on. And Stefan has been an engineer with us with a pretty long tenure as well. He worked for some time on our core technology, which I will briefly introduce a little bit later, and has now switched over to working on our container-facing product, which is called Navops. He has been instrumental in integrating Navops with OpenShift and also in building the mixed workload support that we will be seeing in the demo a little bit later today.

Also very briefly, by way of introduction, who Univa is. We are really focused on enabling customers to use large shared infrastructure for any type of workload, be that containerized or not containerized. We have offices in Chicago, in Canada, and in Germany. And we're really focused on enterprise customers; Fortune 500 companies are mainly our customer base, of which I have a bunch of logos here on the next slide. Oops, one slide too far. We address a wide breadth of markets, and in those markets we usually have the biggest companies and the biggest clusters in that market being driven by our core technology. That is not a recent technology; it has actually been around for more than 15 years and is called Grid Engine. If people here remember Sun Grid Engine, that's where it came from. That technology drives some of the biggest clusters out there; some of them have many hundreds of thousands of cores and run very business-critical applications for these companies. And again, it goes across different markets, with very diverse applications. For the mixed workload support, that is also one of the pieces we're going to use, because all of these applications and workflows are already integrated with our core technology, and that makes it much easier for customers to utilize the solution, but more on that a little bit later.

So let me first jump into what we're doing in container land, specifically in Kubernetes land. The slides always seem to get advanced by two, sorry for that. We have been creating a product suite which we call Navops, and Navops Command, one of the products in that suite, is our cornerstone product in that space. It really, as we usually do, focuses on workload and resource management.
In particular, it provides virtual multi-tenancy. By that we mean teams, projects, whatever organizational breakdown of your workloads you have, can share a single cluster or just a few clusters, make combined use of them, drive up utilization of that infrastructure, and get a better return on investment. Then, to drive that further, we also provide the ability to run mixed workloads. By that we mean containerized and non-containerized workloads on exactly the same infrastructure, again to drive up utilization and to make it easier to integrate into existing environments or to migrate from an existing environment into a container-facing architecture. We also manage scarcity with it. By that we mean that if you have different applications, workflows, projects, and teams that compete for resources, and there is not enough to do all of them at the same time, then we handle prioritization and service level agreements automatically, in order to get the most important work done at the right time and give it access to the right resources.

Of course, you don't do that just for fun. You do it for a reason, and that reason is a better return on investment on your infrastructure. We have been talking to customers who are in the process of adopting Kubernetes in a big way, or OpenShift in most cases, actually, when you talk to commercial customers. Some of those customers, for instance, were planning to create many dozens of OpenShift clusters for the different projects and for the different stages of those projects, so dev, test, production, and so on. If you have many dozens of those clusters, then of course that drives inefficiency: for one thing, you have to maintain all those different clusters, make sure they're running properly, upgrade them, et cetera, and at the same time none of these clusters will ever be utilized to a good extent. You will always have idle resources in those clusters, and the consequence is that the overall environment will probably have pretty poor utilization. The type of functionality that we provide allows you to consolidate those clusters and drive up utilization. Actually, the 50% utilization figure that I have here on the slide is very, very conservative. In the environments I was talking about before, where our core technology is being used, hundreds of thousands of cores as I said, we regularly see utilization rates of 80% and in some cases way above 90%. And that's necessary. If you have environments that big, they may represent $100 million in total cost of ownership, so a few percent of utilization makes a big difference there.

Some of the unique capabilities that our solution provides: first of all, something that isn't really in any product in the Kubernetes space currently, as far as I can see, is that we prioritize workloads, and that's pretty much automatic. By getting the workload properly submitted to your system and advertising certain things to our scheduler, we prioritize it automatically and dynamically. We have a sophisticated policy system, as we shall see on a slide that follows shortly. We provide mixed workload support, so for containerized and non-containerized workloads. And then we have a whole set of functionality that makes it easier to use the system, for instance a web UI to drive the policy configuration, and of course a CLI and a REST API as well. Our workload decision making is affiliation based.
So as I mentioned before, when you submit your workload properly, we will know where it comes from: for instance, who is the owner, which project does it belong to, is there a certain workload template that you want to use? And that will be used in the automatic policy decision making. We support any Kubernetes distribution. Our solution is totally pluggable; I'll talk about that also in a minute. And you can actually reconfigure the policy system on the fly. There is no need to stop any components and restart them if you have made a change. It's really just changing some of the policy configuration in the web UI, for instance, and the changes take effect immediately.

Very simply stated, Navops Command is a kind of replacement for the Kubernetes scheduler. That's not 100% correct, really. What happens is that we install Navops Command side by side with the Kubernetes scheduler. You can use the stock Kubernetes scheduler and, in parallel, the Navops Command scheduler for different types of workloads. But you could also, if you wanted to, completely replace the Kubernetes scheduler with Navops Command; that is a configuration option. In most cases, we would probably recommend running them side by side.

A few more words about the solution from a technical point of view. First of all, Navops Command is itself a service. It's a multi-container application. It ships basically as a pod, or actually two pods, that you can just start by curling a YAML file and getting it created in the Kubernetes cluster. It has a couple of components; I have an architecture slide on that later on. And it interacts basically completely through the Kube API server, so there isn't really anything it does that doesn't fit a regular Kubernetes system, hence it is pluggable. As an end user, you continue to interact with the Kube API server: you submit your job with kubectl, you make your changes with kubectl. The only thing you do differently is when you want to make policy changes to Navops Command; then you would use our web UI, CLI, or REST API to do so. Inside, Navops Command uses the policy engine that we have been developing for many, many years in the core business that I was mentioning before, with high scalability, rich policies, and so on.

Here is an architecture diagram of how it works. As you can see, on the left side there is the Kube API, and the end user interacts directly with it. Navops Command itself also interacts with the Kube API: it basically registers for events, gets any changes that happen to objects in the Kube API, and integrates that with its policy system. And if there is a scheduling decision to be made, it makes it in the same way as the stock scheduler does. As an admin, you interact with the Navops Command system through its web UI, CLI, and REST API. The CLI is modeled pretty much after kubectl, so there's quite a familiarity there. And then, of course, the web UI does its own thing; we will see the web UI in action later when Stefan runs the demo.

A little bit of an overview of the policy system. We will also see it later in the web UI, but let me first start at the upper right. The workload affiliation is a core part; I've mentioned it before. We have extended some of the labeling and annotation properties of a Kubernetes pod manifest so you can express these things.
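As an illustration of what such a submission could look like, here is a minimal sketch using the Kubernetes Python client. The schedulerName field is the standard Kubernetes mechanism for directing a pod to an alternate scheduler running alongside the default one; the label keys (owner, project, application-profile) and the scheduler name navops-command are placeholders chosen for this example, not Univa's documented interface.

```python
# Minimal sketch: submit a pod that (a) is directed to an alternate scheduler
# via the standard Kubernetes schedulerName field and (b) carries labels that a
# policy engine could read as "affiliation" metadata (owner, project, profile).
# The label keys and the scheduler name below are illustrative placeholders.
from kubernetes import client, config

config.load_kube_config()  # use load_incluster_config() when running inside the cluster

pod = client.V1Pod(
    api_version="v1",
    kind="Pod",
    metadata=client.V1ObjectMeta(
        name="affiliated-batch-job",
        namespace="mixed-workload",
        labels={
            "owner": "stefan",                # who submitted the work
            "project": "backend",             # which tenant / project it belongs to
            "application-profile": "batch",   # which workload template applies
        },
    ),
    spec=client.V1PodSpec(
        scheduler_name="navops-command",      # hand this pod to the side-by-side scheduler
        restart_policy="Never",
        containers=[
            client.V1Container(
                name="worker",
                image="busybox",
                command=["sh", "-c", "echo processing && sleep 30"],
            )
        ],
    ),
)

client.CoreV1Api().create_namespaced_pod(namespace="mixed-workload", body=pod)
```

Pods submitted without the scheduler_name field would keep being placed by the stock scheduler, which is the side-by-side arrangement described above.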
You can express who is the owner of a workload, which project it belongs to, and which application profile, which is a kind of workload template, it belongs to. And once you have done that, you've given our Navops Command scheduler information that it can then use in the context of all the other policies that you see on the screen. We also have the default policies that the stock Kubernetes scheduler has, meaning pack and spread when it comes to node selection. But we have additional policies, for instance maximize utilization, which allows you to not just distribute workloads, but to actually look at the current utilization of resources on a node and at the type of requirements a workload has, and then look for the best fit for that workload, to really create balanced workload placement and good performance for those workloads.

But that's just the node selection. We have, of course, additional policies, and those policies on the one hand are about workload priority. There's a bunch of sub-policies there, for instance the proportional shares policy, which allows you to subdivide your cluster into those multi-tenant partitions. There's an interleaving policy, which allows you to maintain certain ratios of replicas following policy guidance. There is ranking, which simply allows you to rank applications by application profile or by resource. And then there are workload isolation quotas, so, for instance, runtime quotas and access restrictions. We will see some of these in action in the demo later. So that, just by way of overview. Here is a screenshot of one of them, the proportional sharing, as it appears in the web UI. It shows how you can more or less graphically subdivide your environment into different partitions, and do that even in a hierarchical fashion, so you can subdivide a partition again and again, depending on how you want it to be done.

Maybe one word, before we dive more into the demo use case itself, on whether all of this is really necessary, all these policies and so on. And our point is: yes, it absolutely is. We live in a world where resources are finite. Sometimes clouds give you the illusion they are infinite, that you could always add another resource and another resource, but the fact of the matter is that if nothing else is limited, then at least budgets are. And if you are using on-prem resources, then there are limitations. Anyhow, we see it ourselves in our own development. Some of the grassroots things that we did first with clouds started out relatively nimble, but once you let that go for a couple of months and you look at the bill that you're getting from the cloud providers, you go, what? Are we really paying that much? And you immediately have to think about utilizing your resources better, because otherwise the spending gets out of hand. So if you recognize that there is a restriction on resources, which at the end of the day basically comes down to the fact that you have just a certain number of servers, then sharing resources is actually a key thing. And then you need to automate that type of sharing, because otherwise you're constantly in the business of reconfiguring your environment. That's why we have been creating policies that can automatically maintain SLAs, resource partitions, and similar things. Our solution really is about combining resources into larger clusters and getting that sharing to work, so you can actually achieve more with less spending.
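The hierarchical partitioning behind the proportional-shares policy mentioned above can be illustrated with a few lines of bookkeeping. This is purely an illustrative sketch of how relative shares in a tree translate into fractions of one cluster; it is not Univa's implementation, and the tenant names and weights loosely follow the demo configuration described later but are otherwise made up.

```python
# Illustrative sketch of hierarchical proportional shares: each node holds a
# relative weight among its siblings, and a leaf's entitlement is the product
# of the normalized weights along its path. Names and numbers are made up.
share_tree = {
    "cicd": {"weight": 1, "children": {}},            # dev/test work
    "ops": {
        "weight": 2,
        "children": {
            "batch":    {"weight": 3, "children": {
                "backend":     {"weight": 3, "children": {}},
                "grid-engine": {"weight": 1, "children": {}},
            }},
            "services": {"weight": 1, "children": {
                "production":  {"weight": 1, "children": {}},
            }},
        },
    },
}

def leaf_entitlements(tree, fraction=1.0, path=""):
    """Return {leaf_path: fraction_of_cluster} for every leaf in the tree."""
    total_weight = sum(node["weight"] for node in tree.values())
    result = {}
    for name, node in tree.items():
        share = fraction * node["weight"] / total_weight
        full_path = f"{path}/{name}"
        if node["children"]:
            result.update(leaf_entitlements(node["children"], share, full_path))
        else:
            result[full_path] = share
    return result

for leaf, share in leaf_entitlements(share_tree).items():
    print(f"{leaf}: {share:.0%} of the cluster")
```

The demo later walks through exactly this kind of tree (CI/CD versus Ops at the top, batch versus services under Ops, back-end versus Grid Engine under batch), and moving a slider in the web UI roughly corresponds to changing one of these weights.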
And the prioritization that you have there at the heart of those policies is really very, very important. You always have dynamic changes. Something that is crucially important now could be less important when some other workload comes up that has a higher priority at that moment. So you always have to stay on top of those things, and there is really no way to do that other than automating it. Also, you want to give as much information as you can to a scheduler; you don't want to withhold things. For instance, Kubernetes provides the notion of submission quotas, or access quotas. These are sometimes useful, but they basically hide information from the scheduler, namely that additional work really wants to run. Our belief is that you really have to give the scheduler as much information as possible. That's why, for instance, we have runtime quotas: you don't have to hide anything, and the scheduler will make sure work doesn't run if it's not supposed to run. Also, if you give the scheduler all of this information and you look at what its decision-making process is, then you can analyze why certain decisions were made and why maybe you are running into a wall. For instance, if a certain service cannot get enough replicas running, then looking at why the scheduler had to make those decisions may reveal that you're lacking some critical resources; maybe you need to buy additional resources or allocate them through a cloud. So you can really do some capacity planning if you have a sophisticated scheduler and can inspect what kind of decisions it is making.

Now coming to the demo use case and mixed workloads. First of all, the environment that we will be looking at is an OpenShift cluster, and we have Navops Command installed on top of it, managing some containerized services and containerized applications. That gives you all the nice policy control that I was talking about. But what if you want to run non-containerized applications? You could, of course, run them alongside this environment, maybe split your cluster and have one part running containerized services and another part running non-containerized work. But the problem is, first of all, that again you would be creating silos and inefficiencies. And also, if those non-containerized workloads need to interact with the containerized workloads, then you will benefit from sharing the same networking setup, the same storage solutions, et cetera. So what we have created is a version of our Univa Grid Engine technology which, as I mentioned, is integrated with many thousands of applications and hundreds of workflows. It can actually also handle containerized workloads, but really more batch workloads in that context, not services. The idea is that you run Univa Grid Engine, in this case, itself as a containerized service. It basically runs as a workload processing service, but the work it runs doesn't have to be containerized itself. And that immediately gives you access to an integrated approach in that context.

So the demo use case that we will be looking at is that we have basically two teams that we are emulating. One is the development team, doing CI/CD. The other one is the operations team, which has a batch team that runs batch workloads and a services team that runs service workloads.
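Before the demo, it may help to picture how such a workload processing service fits into the cluster: the execution daemons run as ordinary pods that the scheduler can scale up or down like any other tenant's workload, while the jobs they execute stay non-containerized inside those pods. The sketch below, again using the Kubernetes Python client, creates such a deployment; the image name, labels, resource requests, and replica count are placeholders for illustration, not Univa's actual artifacts or packaging.

```python
# Illustrative sketch: run a workload-processing service (e.g. Grid Engine
# execution daemons) as a plain Kubernetes Deployment so the cluster scheduler
# can scale it against other tenants. Image name and labels are placeholders.
from kubernetes import client, config

config.load_kube_config()

execd = client.V1Deployment(
    api_version="apps/v1",
    kind="Deployment",
    metadata=client.V1ObjectMeta(name="uge-execd", namespace="mixed-workload"),
    spec=client.V1DeploymentSpec(
        replicas=1,  # the policy engine can later raise or lower this
        selector=client.V1LabelSelector(match_labels={"app": "uge-execd"}),
        template=client.V1PodTemplateSpec(
            metadata=client.V1ObjectMeta(
                labels={"app": "uge-execd", "project": "grid-engine"},
            ),
            spec=client.V1PodSpec(
                containers=[
                    client.V1Container(
                        name="execd",
                        image="example.com/uge-execd:latest",  # placeholder image
                        resources=client.V1ResourceRequirements(
                            requests={"cpu": "2", "memory": "4Gi"},  # sized for a couple of job slots
                        ),
                    )
                ],
            ),
        ),
    ),
)

client.AppsV1Api().create_namespaced_deployment(namespace="mixed-workload", body=execd)
```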
Then, in terms of the workloads being run, there will be a number of development jobs and tasks for dev and test purposes. There will be service tasks as well, and batch applications; those batch applications will be non-containerized. The demo environment is hosted on AWS and, as I've mentioned before, it runs OpenShift, with Navops Command deployed as a scheduler into it. With that, I hand over to Stefan. Let me just stop sharing. I hope Stefan can take over. Make sure you unmute yourself.

So can you see my screen? Yes, we can. Perfect. Thank you, Fritz. And welcome to the demo of the mixed workload support of Navops Command on top of this Red Hat OpenShift system. As we heard, Navops Command provides virtual multi-tenancy on top of Red Hat OpenShift. It allows you to share one cluster among multiple teams, applications, or even services. It even allows you to run containerized and non-containerized workloads in the same environment, by running our Univa Grid Engine as a workload processing service for the non-containerized applications on top of OpenShift. This is what I'm going to demonstrate to you right now.

So let's dive directly into what the Navops Command mixed workload support allows you to have running simultaneously. Here you can see the containerized application services that are currently running on the cluster. We have a production service that is replicated two times; here's the second one. There is a back-end batch application currently scaled to something like eight instances. And we have the Univa Grid Engine workload processing service, currently scaled to one instance, as well as the Univa Grid Engine master service, which you can see right here, last in the list. Here on the right side is a view of the non-containerized workloads running inside the Univa Grid Engine service, which is currently scaled to one pod in the OpenShift cluster. As Fritz already said, Univa Grid Engine is a leading workload management solution and is integrated with thousands of applications and hundreds of workflows. In our demo we are running non-containerized applications, currently up to two of them per Univa Grid Engine pod. As you can see here, we have jobs 75 and 76 running inside the Univa Grid Engine service.

So now let's have a look at how this environment is managed through Navops Command and how we can modify and enforce the virtual multi-tenancy policies through it. Let me just switch to the Navops UI. What you see here is an organizational breakdown of how the entire cluster's resources are to be used. It is reflected in the so-called proportional share policy of Navops Command. The full width of the diagram represents 100% of the cluster resources. At the highest level, we have split it between CI/CD, continuous integration and continuous delivery, and Ops, which is responsible for the production workloads. For CI/CD, we have dev and test work, which is configured to roughly a third of the overall cluster resources. But for simplicity, we are not running any dev or test work as part of this demo. So let's focus on the Ops side, which owns the larger part of the cluster. We have split it down further between batch and services work, with the bigger share going to batch.
The batch resources we have configured mostly to be consumed by work in the back-end project, while only roughly a quarter is assigned to running the non-containerized workloads that get managed by the Univa Grid Engine service. We are also running our production service and have configured resources for that. We have also made a provision for running administrative tasks, but we are not going to demonstrate that right now, again to keep things simple. So the workloads you saw running consume cluster resources organized by the three lowest-level projects of this diagram: the Grid Engine project is using resources, as are the back-end project and the production project.

Now let's make a change by moving this slider between the Grid Engine and the back-end project, to give much more resources to Univa Grid Engine, something like 80/20. Save that configuration and switch back to our workload management system view. This may take a while, and for clarity I have opened another view where I do not show the pending pods or the completed jobs, so I will not switch back to the OpenShift web UI anymore. Due to this change, we will see how back-end pods get scaled down, and Univa Grid Engine execution daemon pods should eat up the freed resources. As I said, the shift will take a little while, so let's change back to Navops Command so I can show you, while that happens, a couple of other policies you can change in the Navops UI.

You might have noticed that there are only 10 pods running at a time owned by the batch team. The Univa Grid Engine execution daemon pods and the master pod belong to the Grid Engine project, and the back-end jobs belong to the back-end project. Both of them are part of the mixed-workload namespace. For each namespace you can set up so-called quotas, and I set up a quota for this namespace, which I called limit-batch, that limits the Grid Engine and the back-end project combined to running only 10 pods at a time. When we now get back to our view, we see that there are many more execution daemons running than back-end jobs, which is due to the change we made in the proportional share. And if you have a look at the Univa Grid Engine qmaster, sorry, I got logged out, we now see that all these new execution daemons are automatically added to the Univa Grid Engine cluster, and all of these newly started execution daemons are filled with legacy, non-containerized workload.

So let's have a look at what else can be configured within Navops Command. As Fritz already said, besides the Kubernetes default placement rules, pack and spread, we also have the so-called maximize utilization policy, which tries to automatically balance the workload placement and mix, and where you can specify entities that should get packed and entities that should get spread in your OpenShift cluster. In Navops Command you also have the possibility to create predefined profiles, the so-called application profiles. In this example, I created a profile for a web server, as well as one for a database. The database profile is configured so that pods tied to that profile need to get dispatched to a node that has at least five gigabytes of memory. Another interesting feature on top of Kubernetes is the so-called interleaving. For example, if you want to keep a ratio of one database pod to five web server pods, you can configure that here, like this. This means that for each database pod, you will be able to run five web server pods.
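The interleaving idea Stefan configures here (one database pod for every five web server pods) boils down to simple ratio bookkeeping. Here is a tiny illustrative sketch of that logic, not Univa's code: given the target ratio and the pods already running, it decides how many more web server pods may be started.

```python
# Illustrative sketch of an interleaving check: keep at most `ratio` web server
# pods per database pod. This is only the bookkeeping idea, not Univa's code.
def allowed_web_pods(db_pods: int, web_pods: int, ratio: int = 5) -> int:
    """How many additional web server pods may start without breaking the ratio."""
    return max(0, db_pods * ratio - web_pods)

# Example: with 2 database pods and 7 web server pods already running,
# 3 more web server pods may be started (2 * 5 - 7).
print(allowed_web_pods(db_pods=2, web_pods=7))  # -> 3
```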
So this concludes my brief demo of the mixed workload support of Navops Command on top of OpenShift. Again, what you saw was a single OpenShift cluster that runs standard containerized workloads along with Univa Grid Engine as a workload processing service for non-containerized legacy workload on top of that. I also demonstrated how Navops Command can be used to dynamically segment the resources in one shared OpenShift-based cluster to provide this virtual multi-tenancy. Thank you for watching the demo.

Awesome, Stefan. I'm looking to see if there are any questions from the folks who are participating here and watching. If you have them, raise your hand in the chat and we'll turn on your mic so you can ask them. I'll give you a second if there are questions. It's interesting, because for me, I was thinking of Univa as something more of an on-premise offering, and so the thing that I got out of this today was that this is a great hybrid offering as well. So it's been pretty interesting to see this and to see it do these things. The scheduler that comes with Kubernetes is pretty basic, and the things that you guys have done with Grid Engine over the years have given you a lot of background and experience. Integrating that and adding the things we actually need for ops, to really make an enterprise offering on top of Kubernetes with workload priorities, is pretty awesome. So I'm pleased that we've gotten this integration done. I'm looking again to see if there are any other questions. Is there anything else you'd like to add, Fritz or Stefan?

No, I mean, you actually summed it up very well, Diane. That's exactly what we're trying to accomplish: that people who are running this in serious production, and who will inevitably run into the situation where they have lots of different teams, lots of projects, and so on, beating on the same set of resources, have the ability to manage this properly. And if any follow-on questions come up later, here on this last slide, which hopefully is visible, you can see how to get in touch with us.

Perfect. And Stefan, thank you for the demo. Especially in the beginning, one of the things that I really liked is that you're doing the myth busting about the unlimited capacity of the cloud, and on premise as well: there's always a limit, someone always has to buy more resources or upgrade things. This is quite an interesting solution. And looking at your customer list there, I saw quite a few OpenShift customers, too, so I know we're doing a lot of work mutually with you. This has been quite an eye-opener, so thank you again for coming today. If you have any questions, please reach out to them directly or hit them up on the Slack channel. Also, all of these guys, plus a few more from Univa, will be at the OpenShift Commons Gathering in Berlin in just under two months, on March 28th. If you're interested in coming, reach out to me and I'll see if I can get you into it. It's co-located with KubeCon, so if you're already going to KubeCon, take a look at commons.openshift.org, where you can find all the information about the upcoming events, as well as all the upcoming briefings over the next few weeks and months. So thanks again, Fritz and Stefan, and thank you all for joining us. We hope you found this interesting too, and we'll talk to you again next week. So thanks, all.

Yep, thanks, Diane, for having us. And thanks to everybody watching.