So good afternoon everyone, your best Italian host ever is here again. We have all had lunch, we all have our refreshments. We now have Robin Sipman from ING, a lead developer. I must say, it's beautiful to have our first consumer of Kubernetes coming on stage. We always see vendors getting their talks through, and they're not product pitches, which is fantastic, you guys are doing a great job, but it's also nice to hear from consumers, to see the troubles they go through to enable Kubernetes within their organization. So Robin, would you like to join me? Give it up.

Thank you very much. So the mic is working, that's good, and everybody has a full stomach, I hope. I'm going to talk about ING's container hosting journey. The short version is that we have our own private cloud and we're building container hosting there, and one of the services that we offer on that container hosting is what we call namespace as a service. This talk is mainly about what that infrastructure actually looks like and how we built it.

First a little bit about me. My name is Robin Sipman, I'm the lead developer in the ING Container Hosting Platform team; that's what the ICHP acronym stands for. ING is a bank, but we write a lot of software. We are very present in Europe, but we are all over the world, and in fact the container hosting platform team, the team that I'm in, is a team with many nationalities in it, which I think is very cool.

Today I'm going to talk about why we actually do namespace as a service, as opposed to maybe giving out full clusters in the private cloud, and then what the stack in the private cloud actually looks like. Then I'm going to zoom in on how we built the namespace as a service on top of OpenShift. I already spoiled it: we run OpenShift instead of plain Kubernetes. I'm going to talk a little bit about the dependencies we have in this journey, and then I'm going to zoom in on some of the controllers, some of the applications. There's some Python code in there, there's some Golang code in there, so I hope you're excited for that. And of course a demo. I recorded it, though, so it's not a live one, but it's a demo nonetheless.

So let's dive in. We only offer namespace as a service, and the goal, I'm just going to go through the full slide, it's a lot of text, is this: if we gave full clusters to consumers, we would hand out a lot of nodes, a lot of resources, and they would not be fully utilized. But if we build a multi-tenant cluster and we offer namespace as a service, then we are in full control of the compute, and we can give namespaces exactly the resources they need. So if one namespace, one application, requires 10 CPU cores and 10 GiB of memory, we give that to them, and another namespace requires something else, and so on. We spread it all out over the cluster, but we stay in control of the compute, and that gives us a number of advantages.

It also helps in terms of compliance. The people who request a namespace don't have to worry about the underlying infrastructure. The cluster just works; they don't have to know what compute nodes they are running on, or whether those are patched, things like that. They take the namespace that we give them as a service, and they get the compliance on top of that.
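For reference, the "exactly what they need" allocation mentioned above is expressed as a standard Kubernetes ResourceQuota per namespace. A minimal sketch with the official Python client; the namespace name and the numbers are invented, not ING's real values:

```python
from kubernetes import client, config

config.load_kube_config()  # or load_incluster_config() inside a cluster

# One quota per tenant namespace: the sum of all pod requests/limits in
# "team-a" may never exceed these values.
quota = client.V1ResourceQuota(
    metadata=client.V1ObjectMeta(name="compute-quota", namespace="team-a"),
    spec=client.V1ResourceQuotaSpec(hard={
        "requests.cpu": "10",       # the 10 CPU cores from the example above
        "requests.memory": "10Gi",  # the 10 GiB of memory
        "limits.cpu": "10",
        "limits.memory": "10Gi",
    }),
)
client.CoreV1Api().create_namespaced_resource_quota("team-a", quota)
```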
They know that the platform they are running on has been penetration tested, has all the risk controls in place, and so on. It does mean that these teams, and we have a lot of them, hundreds of teams use the same Kubernetes cluster, are in control of the stuff they deploy in their namespace. The compliance of the cluster itself, yes, that's in the container hosting team's control, but anything you deploy on top of that, the application team needs to maintain. We will zoom in on that later. If you want to know more about what it's like to run on the container hosting platform, there will be a talk from Adnan tomorrow, and he will walk you through that journey.

So what use cases do we actually support on our clusters? We have two main use cases. One of them is 12-factor apps. I'm wondering, show of hands, who here knows what a 12-factor application is? That's quite a lot, that's good. For those who don't know, the easiest way I would describe 12-factor is that you focus on your app being stateless. That makes it very easy to scale, and if one of your pods gets killed, it doesn't matter, you just spin up a new one. It's all about portability, scalability, and so on. On the Kubernetes cluster itself that means there are no persistent volume claims, no persistence at all. If you need persistence as an application, you connect to an external database, external as in not inside the Kubernetes cluster, but still within your own cloud. This is the main type of workload that we offer.

The other use case is data services providers. These are teams that really know a lot about how to handle persistence and how to handle storage. For those workloads we actually offer storage built on top of Portworx, and for us those are completely separate clusters.

So this namespace-as-a-service offering I'm talking about: what does it actually mean, and why do you have to build so much stuff for it? You could just do `oc adm new-project` and be done, right? Well, not really. If you do `oc adm new-project`, how do you know which users have access to that project? You need to give people access to it, a group maybe. You're also going to need some networking: maybe this namespace needs its own IP address, maybe you need your own private network, and so on. At ING we have Azure DevOps, from which we deploy our code to on-premise, so there also needs to be a connection between that on-premise cluster and Azure DevOps; how is that created? And furthermore, we have multiple data centers at ING, so that if one data center goes down we can move to the other one. That also means that if you request a namespace in one data center, it needs to be available in the other. So there are actually a lot of steps that `oc adm new-project` doesn't do. We have automated them, and that is mostly what this presentation is about.

To do all of that we have a lot of components; I have listed them here. One of them is the ICHP API, which knows all about the different clusters in the different data centers and how to orchestrate them. Then we have the project controller, which ensures that namespaces exist on the cluster. And there's more: there's the auth delegator, the CDAS controller, the image reporter, the pod resource mutator, the quota autoscaler, a huge list of them. We have about 36 at the moment. Obviously I can't cover all 36 right here, but I will zoom in on three of them.
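To make concrete how much work `oc adm new-project` leaves undone, here is the rough shape of one namespace request as a hedged sketch. Every function and type here is a hypothetical stand-in for ING's real internal APIs, invented purely for illustration:

```python
from dataclasses import dataclass, field

@dataclass
class NamespaceRequest:  # hypothetical shape of a request, for illustration
    name: str
    quota: dict = field(default_factory=lambda: {"cpu": "10", "memory": "10Gi"})
    datacenters: tuple = ("dc1", "dc2")

# Stubs standing in for the real internal APIs mentioned in the talk.
def reserve_network(name): print(f"networking: reserved subnet for {name}")
def register_cmdb_asset(name): print(f"cmdb: registered asset {name}")
def enable_charging(name, quota): print(f"charging: billing {name} for {quota}")
def link_pipeline(name): print(f"one pipeline: linked {name} to Azure DevOps")
def create_project(dc, req): print(f"{dc}: created namespace {req.name}")

def provision_namespace(req: NamespaceRequest):
    reserve_network(req.name)             # 1. namespace networking
    register_cmdb_asset(req.name)         # 2. asset registration (CMDB)
    enable_charging(req.name, req.quota)  # 3. charging
    link_pipeline(req.name)               # 4. CI/CD connection
    for dc in req.datacenters:            # 5. replicate across data centers
        create_project(dc, req)

provision_namespace(NamespaceRequest("team-a"))
```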
So in this presentation I'm going to talk about the ICHP API, the project controller, and the quota autoscaler. But before I do that, I want to show you what our infrastructure actually looks like.

At ING we have multiple data centers, and we have a team that offers bare metal as a service: they make sure that the physical compute nodes that come in are registered and ready to be consumed by platforms. On one side we have Azure DevOps, well, I should say ING One Pipeline, which happens to run in Azure DevOps. And then we have a lot of APIs that we depend on: we have the CMDB for asset registration, we have a charging endpoint, there is monitoring and logging in there, security monitoring, and networking. These are all APIs offered in the ING private cloud.

Then we have our infra code, because everything for us is infrastructure as code, and with that we provision nodes using an IPI installation, running OpenShift 4.10 by the way. That gives us an OpenShift Container Platform, a plain OpenShift installation without any of our own components on top of it yet. Then all the applications you just saw are also infra as code: via GitOps we deploy them with Argo CD onto the cluster, and then we truly have the ING container hosting platform. You can also see that these components connect to the APIs that are there, the internal infra APIs.

Then, finally, we can offer namespace as a service on top of that. We have a cloud portal inside ING, very similar to what the public clouds have, where you can click "hey, I want a new namespace" and go through a flow, and that actually calls our APIs, and then the consumer has their namespace. And of course, once consumers have their namespace they want to deploy their application, so their code is also in One Pipeline and goes into the namespace. Any questions about this slide? Because otherwise I'd have to go back later. No? Excellent.

So let me zoom in on some of the applications that we have. The first one is called the project controller. The project controller is written in Python, and it does namespace configuration management. When we originally started building this component, we were well versed in Python in the team, but there was no Python operator framework for Kubernetes, so we built our own, which we call Scaffolds, and I will zoom in on that. What the project controller actually does is take a specification of a project inside ING, for example "I want a new project with this name, I need these resources, it's bound to this group", and create all the resources associated with it: it creates the namespace, it creates resource quotas, it creates role bindings, and so on.

But let me zoom in on the Scaffolds framework first. The way it works is: you have your own application, so this is you building Python code, and you import the StreamWatch. The StreamWatch is offered by the Scaffolds package, and it listens to a Kubernetes object. This can be a custom resource that you have defined yourself, or ConfigMaps, or whatever you please.
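Under the hood that is the standard Kubernetes watch mechanism. A minimal sketch with the official Python client, watching a hypothetical IchpProject custom resource; the API group and plural are invented, not the real CRD:

```python
from kubernetes import client, config, watch

config.load_incluster_config()  # or load_kube_config() outside the cluster
api = client.CustomObjectsApi()

w = watch.Watch()
# Stream ADDED/MODIFIED/DELETED events for a hypothetical IchpProject CRD;
# each event is what a framework like Scaffolds would dispatch to listeners.
for event in w.stream(api.list_cluster_custom_object,
                      group="ichp.example.ing",  # hypothetical API group
                      version="v1",
                      plural="ichpprojects"):
    print(event["type"], event["object"]["metadata"]["name"])
```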
Via a watch, the StreamWatch listens to that object, and it then calls event listeners. EventListener is also a class that the Scaffolds framework offers, and that's what you inherit from, so you have a bunch of classes inheriting from EventListener. That might seem a bit abstract, but if I zoom in like this, you can see that you implement a number of event listeners. In the case of the project controller, for example, an event can be "I want to create a new namespace", so you have an event listener for the namespace, an event listener for the resource quota, and so on. When a new specification comes in, a new custom resource, say somebody tries to create an ICHP project, it goes to the first event listener, which says "all right, I need to create a namespace". If that's successful, it moves on to the next one, which creates a resource quota, then the next one creates the role bindings, and so on. The cool thing is that all these event listeners are called in order, but if one of them fails, the ones that have already been called are rolled back. That's what you see there with the rollback.

Another component we have is called the ICHP API. The project controller you just saw is very cluster specific: it only knows about its own cluster. The ICHP API, however, knows about the different clusters, and that's what you see here. A user goes to the workflow and clicks "hey, I want a new namespace". That hits the ICHP API, which calls all those infrastructure APIs I mentioned before, and then it calls the cluster API on every cluster, which generates the ICHP project spec, which is then picked up by the project controller, and then all the resources are there. That is a lot of orchestration, a lot of things that need to happen, and all of it is of course time consuming.

First let me take a look at the API spec. These are the things we offer: you can get some information about your namespace, you can create one, update it, and delete it, and you can also patch it, for example if you only want to update your resource specifications.

In order to get a proper namespace, these are the steps we need to take. A request comes in; first we need to get some network information, and that network information needs to be registered in the CMDB, that's the asset registration part. And then lastly, if that is all complete, we need to charge for it, and we actually need to create the namespace on the cluster. All of these stages happen sequentially: only after stage one is completed do we move on to stage two, and so on. But all of the units within a stage, those blocks on the left and right are called units, run concurrently. Registering something in the CMDB takes a second or so, but we do it all at the same time, to speed up the process. Still, if we are in the first stage, we don't want to wait all the way until stage three to find out that there is, say, a naming conflict.
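As a minimal sketch of that shape, sequential stages, concurrent units within a stage, a dry-run check pass first, and rollback of whatever already ran when a later unit fails (the same ordered-with-rollback idea as the Scaffolds listeners above). This is my own illustration of the pattern, not the real orchestration framework:

```python
import concurrent.futures

class Unit:
    """One step in a stage, e.g. 'register asset in CMDB'."""
    def check(self): pass      # dry run: fail fast, e.g. on naming conflicts
    def run(self): pass        # the real action
    def rollback(self): pass   # undo the action if a later unit fails

def orchestrate(stages: list[list[Unit]]):
    done: list[Unit] = []
    with concurrent.futures.ThreadPoolExecutor() as pool:
        # Check pass: dry-run every unit before touching anything, so a
        # stage-three problem surfaces before stage one has executed.
        for f in [pool.submit(u.check) for stage in stages for u in stage]:
            f.result()  # re-raises any failure, aborting with nothing to undo
        try:
            # Run pass: stages are sequential, units inside a stage concurrent.
            for stage in stages:
                futures = {pool.submit(u.run): u for u in stage}
                error = None
                for f in concurrent.futures.as_completed(futures):
                    try:
                        f.result()
                        done.append(futures[f])
                    except Exception as exc:
                        error = exc
                if error:
                    raise error
        except Exception:
            # Roll back everything that already succeeded, in reverse order.
            for u in reversed(done):
                u.rollback()
            raise
```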
So at the first stage we already run what we call a check stage: we do some sanity checks on whether the request is likely to succeed, and only when it is likely to succeed do we continue with the run stage, where we do the actual networking and so on. If the flow is all good but at the very last unit something fails, then we are in a bit of trouble, because we have already executed a lot of actions: the asset registration has been done, the networking is done, but somehow we cannot create this namespace on the cluster. In that case we need to roll back everything we have done so far.

Now, what does that actually look like in code? All the steps you just saw in the diagram are there. The first curly-bracket block is a stage block, so to speak, and you can see the network creation step in it, and then the other steps, the CMDB create, the charging, and the namespace, but as dry runs in that first stage. Then we have all the other steps; I hope this is clear to see. And then finally, in the run stage, we execute everything: the cluster actions and the reply to the caller.

All right. The last component I want to show you is called the quota autoscaler. We found that when you request a namespace on our clusters, the requester has to fill in how many resources their application is going to consume, and at the moment you're requesting it, that is very hard to estimate. Some people haven't even started writing their application yet, and we are already asking them to estimate how much they are going to use. That's very tricky. On top of that, when you start building your application, you also need to fill in pod resource requests, how many CPUs your pod is going to use, and again you're going to make an estimate. Maybe after you've seen it running for a while, after you've done your performance tests, then you really know what your application actually uses. But it turns out that knowing all your resource needs before you even request your namespace is really tricky, and to help with that we have the quota autoscaler.

What we do is look at the namespace resource quota. I hope everybody here knows what a resource quota is, since we're all Kubernetes people, but in case you don't: a resource quota is basically a limitation on the compute resources that your namespace can consume, so the sum of all the pod resources in your namespace cannot exceed what it says in your resource quota. This is also what you pay for, by the way; in the ING private cloud everybody pays for their resource quota.

Now, if we notice that a resource quota is getting full, we can watch that using this quota autoscaler component, and when it's almost full we automatically make calls to our charging endpoint, to the ICHP API: "hey, this quota is almost full, this team needs more resources". Then we call our automation, which increases the resource quota. This behavior is managed by a custom resource called a QuotaScaler object, which looks something like this. I hope it looks familiar, because it's almost the same as a horizontal pod autoscaler: you have some behavior, you can set some minimum values and some maximum values, and in the behavior you express what you would like the CPU utilization ratio to be.
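As a reconstruction in the spirit of that description, a hypothetical QuotaScaler object might look roughly like this; every field name is a guess, not the real schema, and the max values are borrowed from the per-namespace limits mentioned in the Q&A later:

```python
# Hypothetical QuotaScaler manifest, mirroring the HPA-like shape described:
# keep the namespace resource quota between 50% and 70% utilized, within
# absolute floor and ceiling values.
quota_scaler = {
    "apiVersion": "ichp.example.ing/v1",  # hypothetical API group
    "kind": "QuotaScaler",
    "metadata": {"name": "default", "namespace": "team-a"},
    "spec": {
        "minQuota": {"cpu": "2", "memory": "4Gi"},     # never shrink below
        "maxQuota": {"cpu": "35", "memory": "150Gi"},  # misuse/upper bound
        "behavior": {
            "cpuUtilization": {"low": 50, "high": 70},     # target band, %
            "memoryUtilization": {"low": 50, "high": 70},
        },
    },
}
```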
So if you set the values 50 and 70, it means "I want my resource quota to be between 50 and 70 percent utilized". If you put down 100 percent, it means you want 100 percent efficient usage of your resource quota; that means there is essentially no room left for new pods to come up, you are fully using your quota.

I want to share some metrics with you on how this affects resource allocation on a cluster. Not too long ago we were running on OpenShift 3.11, and this is what the CPU allocation looked like: we had over 9,000 cores allocated in resource quotas. The top line is the quota, what the namespaces have and what you pay for, and the request line is the sum of all the pods, what the pods actually request. You can see that teams highly overestimate how much resources their applications actually need. You can see the usage there as well: the memory usage relative to requests is quite good, while for CPU there is a lot more burst to it.

Then we built another cluster, an OpenShift 4 cluster, and there we implemented the quota autoscaler, and you can see the difference: we went from 9,000 cores to, give or take, 1,700 to 2,000 cores, and similarly for memory. So users no longer have to worry about what they fill in for the namespace resource quota; it scales automatically.

One other thing we did: for dev and test workloads we saw a lot of high CPU requests at the pod level, and those also allocate resources on the cluster. For dev and test namespaces you can be a bit more flexible, so there we automatically scale the CPU request down to 10 millicores, but we still allow users to set their limits as high as they want. For those who don't know: if you request resources in a pod, requests are resources you are guaranteed to get, whereas limits are only reached if the resources are actually available on the cluster. And since we have so many compute nodes in the cluster, you're almost bound to be able to hit your limits anyway. This lets us gain in this margin here: we have the requests as filled in, and for dev and test we force them lower, and that gives us a nice bump there. Obviously we can't do that for memory, because memory is not as compressible as CPU.
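That request rewriting is a textbook job for a mutating admission webhook. A minimal sketch of just the mutation step, assuming the dev/test-namespace decision has already been made; the AdmissionReview and JSONPatch shapes are standard Kubernetes, the rest is my own illustration:

```python
import base64, json

def mutate_dev_test_pod(admission_review: dict) -> dict:
    """Force CPU *requests* to 10m on dev/test pods; leave limits untouched."""
    pod = admission_review["request"]["object"]
    patch = []
    for i, c in enumerate(pod["spec"]["containers"]):
        requests = c.get("resources", {}).get("requests", {})
        if requests.get("cpu"):  # only rewrite if a CPU request is set
            patch.append({
                "op": "replace",
                "path": f"/spec/containers/{i}/resources/requests/cpu",
                "value": "10m",
            })
    return {
        "apiVersion": "admission.k8s.io/v1",
        "kind": "AdmissionReview",
        "response": {
            "uid": admission_review["request"]["uid"],
            "allowed": True,
            "patchType": "JSONPatch",
            # The patch is sent back base64-encoded, per the admission API.
            "patch": base64.b64encode(json.dumps(patch).encode()).decode(),
        },
    }
```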
So here are some more metrics: you can see the quota scaling and the pod resource mutation I was talking about. By implementing the quota autoscaler feature we saved 7.6 thousand CPU cores in namespace resource quotas, and with the mutation for the dev and test namespaces we saved another 500 CPU cores in requests. So, as I already mentioned: the pod resource mutator, for development and test namespaces only, forces the pod CPU request to 10 millicores, but the CPU limits can stay as they are.

All right, last but not least, I have a demo. It will be a very quick demo, and it's a video, but it shows the full stack as I've just described it. We're going to request a namespace via the ICHP API; after the namespace has been provisioned, we're going to scale up some workload in it, and then we're going to see the quota autoscaler in action.

All right, here we go. Here is a namespace specification for the ICHP API: it has a name, a workload type, and the resources we want to have. We post this payload to our automation to create a namespace, and this creates the namespace on multiple clusters. We can also see what operations it did: there was some networking in there, there was some asset registration, and finally it was created on the cluster. Those are all the steps.

Now, if we look at what is actually on the cluster: the automation created the ICHP project spec there. Again, it has the same name, the same quotas, the same workload type. This is then picked up by the project controller, which in turn creates the namespace and the resource quota, and you can see that whatever is in the specification is reflected in the resource quota. It also creates the role bindings, so which groups have access to this namespace, and you automatically get a QuotaScaler.

Now that we have our namespace, let's deploy something in it. I have this nginx pod here; it has a CPU request and a CPU limit. So let's make it happen. It's starting up, and we can see we are using some of our quota: we're using 250 megabytes out of one gigabyte, and one out of four on the CPU limit. So we can scale up to four replicas before we hit our resource quota. At four we are completely full: if you try to create another pod, it will fail, because your resource quota is exhausted. So we have four pods running, and now we scale up to six. We get all these scary error messages, because it's forbidden: you're trying to break out of your quota.

But this is where the quota autoscaler comes in. It is actually listening to these events, and it knows you are trying to scale up beyond your quota. It calculates, based on the specification of the ReplicaSet, how much more resource you actually need, and then it calls our automation, which updates the resource quota. We can see it's actually updated there, and all the pods are running. That's what I meant with 100 percent efficient resource quota usage.

There's more; I have one more slide. I showed you a lot of stuff, some Python code, some other things, but it's really nice to boast about it and not very useful for you unless you can touch it yourself.
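Until then, for a feel of what the autoscaler in that demo is reacting to internally, a hedged sketch: it watches for quota-exceeded events and asks the automation for more. The real controller computes the exact shortfall from the ReplicaSet spec; here the computation and the automation call are stubbed out as hypothetical:

```python
from kubernetes import client, config, watch

config.load_incluster_config()  # the controller runs in-cluster
v1 = client.CoreV1Api()

def request_quota_increase(namespace: str, extra_cpu: str):
    # Hypothetical stand-in for the call into the ICHP API / charging
    # automation, which bumps the ResourceQuota (within the QuotaScaler max).
    print(f"automation: raise CPU quota of {namespace} by {extra_cpu}")

w = watch.Watch()
for e in w.stream(v1.list_event_for_all_namespaces):
    ev = e["object"]
    # A ReplicaSet that cannot create pods emits FailedCreate events with an
    # "exceeded quota" message -- the signal the demo reacted to. The real
    # controller derives the exact delta from the ReplicaSet spec; here we
    # just bump by one core as a placeholder.
    if ev.reason == "FailedCreate" and "exceeded quota" in (ev.message or ""):
        request_quota_increase(ev.metadata.namespace, "1")
```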
So, can you have all this code? The answer is: yes, you can. At KubeCon we will open source this live.

And what will you actually get? You will get the code for the quota autoscaler, the full component. You will also get the Python operator framework; we will open source that too. You will not get the project controller yet; we are planning on open sourcing it as well, maybe, but for now you get the Python operator framework. And for the orchestration you saw, all those stages that run sequentially with the units running concurrently, that's what we call the orchestration framework, and that will also be open source. So that is it. Are there any questions? There are questions, but you guys don't have microphones, so...

Q: Thank you for your talk. I want to ask: do you use this in production? Because I've seen data from non-production, and I wonder if you already use it there.

A: Yeah, we use everything you see here in production. The components are replicated across the different clusters: for DTA there is one instance of the ICHP API and quota autoscaler running, and for production we have another instance running, across multiple clusters as well. So yes, everything you see is running in prod.

Q: I wonder why you chose to show the data for non-production rather than production. It would maybe be more impressive to see how many resources you save in production.

A: The reasoning for this graph is that we noticed a lot of resources are allocated for dev and test. For production workloads we are not very eager to scale down the requests: if a team says they need that many resources to run in production, we believe them, we're not going to touch that. For dev and test we can be a bit more aggressive, and that's why those metrics are the more interesting ones. The production metrics are perhaps a bit less impressive, because you don't have the pod resource mutator there, but at the quota autoscaling level it is still very beneficial.

Q: Thanks. We have a similar workload, and we made some different decisions: we actually used some market solutions, like Capsule, for creating namespaces automatically by the developers themselves, and also the vertical pod autoscaler. The last one is not working properly for us, so maybe this is convenient to use, but I'm wondering: why is ING developing these solutions itself, when there are solutions on the market?

A: So, for the namespace-as-a-service part, we have a lot of interfaces inside ING that are very ING specific, for example our networking implementation. That is just very ING specific, so we cannot ask a vendor to build that for us. And for the second part, the scaling part: it actually works together with the horizontal pod autoscaler as well as the vertical pod autoscaler, you can use both, and this quota autoscaler kind of runs on top of that. You can see it as management for your namespace resource quotas instead. We actually talked with multiple vendors and multiple companies, and none of them had built it yet. Therefore we built it ourselves, and you can also use it once it's open source. I hope that answers your question.

Q: Okay, thank you. One question: I saw that there is a CMDB record for each namespace. With this orchestration, can you get multiple namespaces per CMDB record, or is it one namespace per CMDB record?
A: Yeah. Because we run many, many clusters, and all these clusters can hold the same namespace, depending on how many times you want it replicated, what we register in the CMDB is the namespace name combined with an identifier of the cluster it is on. So let's say your namespace is called example: we will register example-1, example-2, example-3, with one, two, three being the cluster identifiers. Does that make sense?

Q: Yes. Hi, thanks for the talk. Do you have any restrictions on the upper limit for that quota autoscaler, in case of misuse or a security incident?

A: Yes, we do, and I'm sad to say it's also necessary. There is a limit; from the top of my head I think it's about 35 CPU cores and about 150 gigs of RAM per namespace. It can happen that some workloads genuinely have an extremely high load, use all of that, and need more, but usually those are big consumers, and we talk with them about their use case, because if a team hits these limits without talking to us, they are usually over-committing: requesting more resources than they actually need. So this limit is in place, and if a team needs more, we discuss it on a team-to-team basis.

Q (from the back of the room): You explained the support for 12-factor applications. I was just curious whether ICHP supports stateful applications as well.

A: We currently only provide stateful applications through the data services. For most of our consumers it's all stateless, 12-factor, but inside ING we have providers that offer data services, for example the ELK stack; you might know the ELK stack, those guys have been at ING for five years. So the ELK stack is a consumer of persistent storage, and in the end it uses the same automation.

Host: Robin, are you okay to take one last question?

A: Yeah, of course.

Q: Thank you, I have a small question. Since you're basically a private cloud service provider: how many clients do you have at this moment, more or less?

A: Inside the cluster we have a little over 2,000 namespaces, and I think that together, non-prod and prod, we run about 5,000 pods, and that's per environment, since we replicate everything in another data center as well. Are those the metrics you were looking for?

Q: I was actually curious about a more technical part of this, which is the persistence of the Kubernetes clusters you're using under the hood. Did you go with a custom solution for storing the state, or are you still using the standard one, which is etcd?

A: We are still using etcd, yes. But are you referring to the storage we use for our own APIs, or the storage we offer to our data services consumers?

Q: The internal one, specifically for the needs of the Kubernetes clusters themselves.

A: Yeah, that's etcd. And we found etcd performant, but keep a good eye on the metrics, and you may also need to fine-tune it. There are quite some nice articles about it that tell you: okay, I have a huge cluster, which parameters do I need to fiddle with?

Q: Okay, thank you.

A: All right. If you have more questions, I'll be walking around afterwards. Thank you very much for listening.
Have a great day.

Host: Well, thank you very much, Robin. I am a hundred percent sure that people will come find you, not necessarily for good things. Now, we're going to restart at 2:30 with Ara, and she's going to show us the Gateway API. So you can stick around, or you can go get some refreshments, but please be on time; we're going to start at 2:30 sharp. Thank you.