Oh, it works out of the box. Sorry for being the last talk today. I've worked for a fintech company for over 10 years. Today is a special day for me, because I just turned 34. I have no pictures of my hobbies, because I have no time: I also have four sons. So I have to have my fun at work, which I gladly present today.

The company is a 40-year-old firm, what you would now call a fintech company. It's a Dutch company, and we operate globally. We help make financial investment decisions for over 14 trillion euros in assets. We serve our customers globally with a lot of number-crunching workloads, and for the past two years I've been responsible for our cloud-native transformation as a company. Since I have a background in HPC myself, I really liked onboarding our HPC workloads. Last year I was in Valencia for inspiration, and I'm very excited to now share our journey so far as a speaker.

So, a bit of context. We decided to do managed over do-it-yourself, so we run managed Kubernetes. We have different flavors, of course; in the context of this talk it's about managed OpenShift on both AWS and Azure. The decommissioning of our on-prem data centers was a fact, so it was a big challenge to move our HPC operations to these managed Kubernetes environments.

Next to being managed, we also wanted it to perform well, at least on par with our bare metal. We wanted to scale, not only up but also down: our workloads are very peaky in the financial industry. At the beginning of the quarter and the end of the year, our clients need thousands of CPUs, and the rest of the time they don't need them and don't want to pay for them. Also concurrency: a lot of talks today are about scheduling, and we try to get rid of the scheduling problem as a whole. We want hassle-free multitenancy, which I will explain later. Vendor independence is also very important: although we use managed Kubernetes, we don't want to marry a hyperscaler, and we also don't want to attach ourselves to one specific Kubernetes distribution. So we want a Kubernetes-native solution to run our HPC operations on.

So, at a high level, how does it look? What's under the hood, which hyperscaler you use: we don't care. We want Kubernetes, and on top of that, like Lego bricks, we assembled our framework from very battle-tested open-source projects: Prometheus, KEDA, Knative, and as a queuing system the open-source ActiveMQ queue. Configuring them for our framework is a few hundred lines of YAML, which we package, Kustomize and Argo CD ready, in a repository I'll share later. On top of that, we also templated how you should define your jobs and your tasks. That's just 300 lines of C#, so that's truly our contribution: it's really about how you interface with everything under the hood. It's .NET, but you could easily do it in Python as well.

So this is a very lean setup. For context: in our pods it can be a thousand lines of Python that's running, but we also have fat pods, like six hundred thousand lines of C# in one container. It's all a one-process-per-container setup.

So how does it look? From left to right, our users interact via a REST API, which is provisioned by Knative, so it's serverless; a sketch of roughly what such a service looks like follows below.
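As an illustration only (this is not the actual manifest from our repository; the name, namespace, and image are hypothetical placeholders), a minimal scale-to-zero Knative Service for such a REST API could look like this:

```yaml
# Minimal sketch of a serverless, scale-to-zero REST API on Knative.
# All names and the image are illustrative placeholders.
apiVersion: serving.knative.dev/v1
kind: Service
metadata:
  name: hpc-job-api
  namespace: tenant-a               # e.g. one namespace per user or application
spec:
  template:
    metadata:
      annotations:
        autoscaling.knative.dev/min-scale: "0"    # idle replicas go to zero
        autoscaling.knative.dev/max-scale: "10"   # cap the burst
    spec:
      containers:
        - image: registry.example.com/hpc-job-api:latest
          ports:
            - containerPort: 8080   # Knative routes HTTP traffic here
```

With min-scale at zero, the first HTTP request is what wakes the API up, which is exactly the trigger described next.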
The HTTP traffic triggers, or wakes, the API. The API is responsible for putting a job on the job queue. KEDA sees the job coming into the queue and creates its KEDA ScaledJob; I'll show a sketch of one after this section. And it's nice that it's in the queue, because if somehow the job is killed or terminated, as long as it's not acknowledged on the queue, the job will be rebooted. So it's very resilient in nature.

The job is responsible for putting tasks on the queue, which also allows us to do DAG workloads. If tasks are put on the queue, KEDA knows how many pods it wants for computing them, and these task runners are actually very stupid things: they just fetch from the queue, execute, acknowledge the task, and repeat until the queue is empty. We have a termination window of x seconds after the queue is empty; then these task runners terminate themselves and scale back to zero. In fact, all elements in the picture scale to zero, even the queuing mechanism. And this is not only nice from a cost perspective but also very nice in a multi-tenancy context: I can make a namespace per user, I can make a namespace per application.

So far, scaling pods. But the big question is whether the infrastructure scales along: scaling pods is easy, but how does that work in a managed context? In a Kubernetes context, the cloud controller manager is responsible for this, and it depends a bit on the hyperscaler's implementation, but some managed Kubernetes providers have machine sets or node pools; they all have different names for it, but in essence VMs are provisioned automatically based on pending pods. By means of taints and tolerations we can provision many flavors of machine sets: with GPUs, without GPUs, Intel, AMD, different memory architectures. And these machine sets also scale back to zero, which is a very interesting feature.

So this is nice in theory, but how does it work in practice? This is kind of the conclusion on that question: does the elasticity promise of the cloud hold, and can you capitalize on it in an HPC context? These are empirical results, but we were able to spin up 1,500 CPUs, from zero and back to zero, within 20 minutes. We were also able to do this on spot instances on Azure, which is nice because those 20 minutes of 1,500 CPUs only cost us six euros, which is insanely cheap. It scales to zero, which is nice, although I have to say that if you use these open-source projects, their operators need to be there, and those have controllers, so you pay that price.

Node provisioning is an interesting topic, because it really relies on how fast your hyperscaler can provision these VMs. We tested on both Azure and AWS: it's roughly four minutes for a node to come up, regardless of the VM family and regardless of the hour of the day. We thought maybe it would be slower during business hours, but that's not the case. It is a bit volatile, though. For this to work you need a very good machine set or node pool implementation, and it depends a bit on the hyperscaler how mature these are.

Regarding downsides: these managed Kubernetes clusters also come with guardrails. Sometimes they have a limited maximum number of nodes you can hit at a certain point. And on Azure we perceived that sometimes we ask for a certain VM type and under the hood we get different hardware, newer-generation hardware, and that leads to unexpected results. I know on AWS this is more tightly coupled.
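Here is the ScaledJob sketch promised above. It is a hedged illustration, not our production manifest: it assumes KEDA's ActiveMQ scaler, and the queue names, endpoints, credentials secret, image, taint key, and sizing are all hypothetical placeholders. It shows the combination described in this talk: task runners spawned from queue depth, tolerating a tainted HPC node pool so pending pods make the machine set scale up.

```yaml
# Hedged sketch of a queue-driven ScaledJob; all names are placeholders.
apiVersion: keda.sh/v1alpha1
kind: ScaledJob
metadata:
  name: hpc-task-runner
  namespace: tenant-a
spec:
  pollingInterval: 10              # check the queue every 10 seconds
  maxReplicaCount: 375             # e.g. 375 pods x 4 CPUs = 1,500 CPUs
  successfulJobsHistoryLimit: 5
  jobTargetRef:
    template:
      spec:
        restartPolicy: Never
        tolerations:               # land on the tainted HPC node pool, so
          - key: workload          # pending pods trigger node provisioning
            value: hpc
            effect: NoSchedule
        containers:
          - name: runner
            image: registry.example.com/task-runner:latest
            resources:
              requests:
                cpu: "4"
  triggers:
    - type: activemq               # scale on the number of queued tasks
      metadata:
        managementEndpoint: "activemq.queueing:8161"
        destinationName: "tasks"
        brokerName: "broker"
        targetQueueSize: "1"       # roughly one pod per queued task
      authenticationRef:
        name: activemq-auth        # TriggerAuthentication with broker credentials
```

And the task runner itself is the "very stupid thing" described above. A minimal C# sketch of that loop, where ITaskQueue is a hypothetical abstraction over whatever broker client you use, not a real library API:

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical abstraction over the queue client; swap in your own broker library.
public interface ITaskQueue
{
    Task<QueueMessage?> FetchAsync(TimeSpan wait, CancellationToken ct); // null if empty
    Task AcknowledgeAsync(QueueMessage message, CancellationToken ct);
}

public sealed record QueueMessage(string Id, string Payload);

public static class TaskRunner
{
    // Fetch, execute, acknowledge, repeat; exit once the queue has been
    // empty for idleTimeout, so the pod terminates and scales to zero.
    public static async Task RunAsync(
        ITaskQueue queue, Func<string, Task> execute,
        TimeSpan idleTimeout, CancellationToken ct)
    {
        var idleSince = DateTime.UtcNow;
        while (!ct.IsCancellationRequested)
        {
            var message = await queue.FetchAsync(TimeSpan.FromSeconds(5), ct);
            if (message is null)
            {
                if (DateTime.UtcNow - idleSince > idleTimeout) break;
                continue;
            }
            await execute(message.Payload);            // do the actual work
            await queue.AcknowledgeAsync(message, ct); // ack only after success,
            idleSince = DateTime.UtcNow;               // so crashes cause redelivery
        }
    }
}
```

Because a task is only acknowledged after it succeeds, a killed pod simply means the broker redelivers the message to the next runner, which is the resilience property mentioned above.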
So, to conclude: where can you learn more? We run this stuff in production with our products, and we extracted kind of the bare minimum of it and put it on GitHub. I have a student now who's working on a Slurm-like interface for the REST API part, so you can talk the Slurm language to it and see how you can offload your workloads there. We also want to track the carbon footprint of the workloads we run. Furthermore, another colleague of mine recorded a demo where you can see this in real life, scaling from zero machines to many; that's the YouTube QR code.

Thank you all for listening. I hope you have a good conference.