Hi, this is Josef Lombardi, and welcome to a brand new episode of Let's Talk. Today we have with us Andrew Beccany, principal engineer at Caring Group. Andrew, it's great to have you on the show. We're going to talk about stateful workloads, and whether one should run databases in Kubernetes or not. How would you define a stateful workload?

Primarily, for me, a persistent or stateful workload is an application that holds state in memory. If that application needs to be shut down, forgetting about Kubernetes completely here, it will typically write to disk. When the application comes back up, it will read from disk and try to recover into the same state it was in before. So I think the key takeaway is: any application that writes to disk.

Let's talk a bit about the beginnings of Kubernetes. It was stateless, but now folks are running a lot of stateful workloads on Kubernetes. You folks have been around for a while and have seen both sides of the aisle. Of course, with every technology, folks try to bring in a lot of use cases that the technology was not designed for; Linux is a very good example. Sometimes we are right, sometimes we are wrong. But let's talk about how stateful workloads started making their way into Kubernetes, and what the arguments are for doing that or not doing that.

Back in the early days of Kubernetes, we had a resource available to us called PetSets. PetSet was almost a demeaning term for workloads that the developers of Kubernetes thought it perhaps wasn't particularly well suited for. PetSets have evolved into something called StatefulSets, which are now very mature. They've been around since, I think, something like Kubernetes 1.6.
So they're very well known. Essentially, the difference between a StatefulSet and, say, a Deployment or a Job in Kubernetes is that we're attaching a persistent disk to each of the replicas inside the StatefulSet. Take a Kafka cluster, for example: it might have eight brokers, and each of the eight brokers is going to have its own persistent disk. Importantly, whenever Kubernetes comes along and does its job and tries to migrate or reschedule the workload, that persistent disk will follow the replica around. So whenever, say, broker 6 in this Kafka cluster comes up, it will have the log files and the information that it expects attached to it.

Can you talk about the general discussions you are hearing, whether within your internal team when you're looking at a specific customer use case, or in the ecosystem in general, about whether we should run stateful workloads on Kubernetes or not?

Yes, it's a constant conversation inside Caring Group. I think it can be summed up like this: if there's a managed service available for your stateful workload, for example a SQL database, I would currently recommend that the customer go and leverage the managed service. This isn't just about having to manage it yourself. It's the fact that you've got people like Amazon and Google investing hundreds of thousands of engineering hours into making the managed service more available, more performant, and everything else. We don't believe that most of our customers would get the value from, or even have the engineering capabilities to make, the same investment.
So there are some no-brainers, if you like, and SQL is one of them. Now, we don't live in a world where everything that is stateful is in a SQL database. It's in a little bit of a lull now, but two to three years ago at Caring Group I was working with a colleague on a project that used blockchain. There are managed services for blockchains, but there was none for the specific private blockchain that we were using, and we made a decision that we thought it would work well on Kubernetes. That was one decision that was backed at Caring Group, and the customers we worked with backed it as well. We wrote an operator, which we'll probably talk about in a little bit, and that operator handles the business logic and everything we needed to run the blockchain application on a Kubernetes cluster.

Having said that, even when I started using Kubernetes in 2017, my first challenge, my first workload running in Kubernetes, was a stateful workload. So I've been one of these guys who, since 2017, has constantly gone against the common knowledge, you know, the Kelsey Hightower tweets about running anything in Kubernetes apart from your stateful workloads or databases. I have been doing that from the beginning anyway. I do understand the concern: with Kubernetes itself, there's still a bit of a feeling that it's hard to get to grips with and hard to control operations-wise. And you can see that people already struggle with their own databases: "We've always had this struggle with a database that hasn't performed well. Are you crazy, thinking we're going to put it into Kubernetes?"
I think in that case, you work with the customer.

Can you go into some specific use cases or workloads where you're like, "No, we should not be running stateful workloads in Kubernetes"?

We've touched on it earlier. I think if it's something that is mission critical, if it's in production and it's really your bread and butter, and it's SQL, we would definitely suggest looking at a managed service if one is available. Obviously this gets complicated when you're running Kubernetes on-premises, where you sometimes don't even have the ability to connect out to a managed service. In that case, we would say that you're probably an enterprise, and you might have something like a database hotel: a team of people running databases in your enterprise. Why not go to them as a customer and ask them to provision a database for you? If you've got those kinds of capabilities available to you as an enterprise, then we would recommend that you leverage them.

When we look at some of these modern technologies, they come into existence to solve a specific problem. But as the user base grows, we suddenly start seeing a lot of use cases that were not even envisioned in the very beginning. What major trends are you seeing in the Kubernetes space when it comes to stateful workloads or databases, where you're like, "Hey, this is becoming the new norm"?

Yeah, definitely the number of customers. You can see even from the Twitter traffic that there are proper enterprises being built on running cloud native on Kubernetes at the moment. I'm a bit biased, as I've worked with Kafka for a long time, but I think Confluent are a good example of this, right?
They're the value-add company for Kafka, and as part of their enterprise suite they have a Kubernetes operator, which will handle all of the log rotation and scaling and all of the stuff that the Kubernetes API is not going to be able to do for you. That business logic needs to be encapsulated in an operator. And that's another good point: the operator framework is much more mature as well. Whether it's an actual database like MySQL, or whether it's Kafka or CouchDB, for example, they've all got operators now, which allow you as an organization to confidently run these databases inside a Kubernetes cluster.

With Kubernetes, we have started to see a lot of things in production now. As organizations look at running their stateful workloads and databases in Kubernetes, what advice do you have for them? What is the right approach for stateful workloads on Kubernetes?

Yes, and here's the cop-out from my perspective: there's no right or wrong answer here, unfortunately. What I would say is that running a stateful workload inside Kubernetes is not fundamentally wrong. It might be quite hard. But I think the point is that if you're running it on, say, a suite of VMs with NGINX in front of them, or whatever your infrastructure might be, you're going to encounter the same challenges as you do in Kubernetes. Kubernetes has a very rapid roadmap, and there seem to be more and more capabilities in Kubernetes that help with these workloads. It does go a little bit against the mantra or the ethos of Kubernetes, I think, in that you have to put a little bit of control, additional YAML, additional guardrails if you like, on this workload.
It goes back to the pets versus cattle situation that we had back in around 2018: you've got pets that you really have to keep an eye on and manage, and you've got to work with them differently to your stateless workloads, for example.

When it comes to running stateful workloads or databases on Kubernetes, there is already complexity there, and then you're also moving your database there. The reality is that we are not going to reduce this complexity; what is going to happen is that we help customers deal with it. Talk a bit about how Caring Group is helping, or how you see customers looking at this, because they should not be wasting too much of their time managing Kubernetes and all those clusters. Their developers should be focusing on writing business applications.

That's a great point, and it's exactly one of the reasons why I think that these days we shouldn't be afraid of running it inside Kubernetes. To take this the other way: imagine that 80% of your workload as an organization has, over the past three or four years of modernization, been moved to Kubernetes and containerized. Why would you then suggest that you have to go and adopt another toolset, like Ansible, Chef, or whatever it is, to manage a suite of VMs to run the database on? Even at a level below the developers, for the operators of this: by not having the database or the stateful workload in Kubernetes, you're effectively mandating that there's a whole other toolset, a whole other runbook, and everything else that an operations team, already under pressure, needs to go and look at. It does feed into topics that you're going to cover, like observability, logging, and everything else. You essentially have to run a separate stack.
You might need to run a separate cluster to look at that database or blockchain or whatever it might be, compared to, say, your front end, your microservices, or the rest of your workload.

Andrew, thank you so much for taking time out today to talk about this topic. As usual, I would love to sit down with you again and discuss it further. Thank you.
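
The StatefulSet behaviour described in the interview, where each of eight Kafka brokers gets its own persistent disk that follows it around, can be roughly sketched as below. This is a minimal illustration rather than anything used by the speakers: the names `kafka` and `data`, the image reference, the mount path, and the `100Gi` size are all hypothetical. It builds a StatefulSet manifest as a plain Python dict and lists the stable pod and PersistentVolumeClaim names Kubernetes derives for each replica (pods are named `<statefulset>-<ordinal>`, claims `<claim template>-<pod name>`).

```python
# Sketch only: "kafka", "data", the image, the mount path, and the size are
# hypothetical values chosen for illustration, not taken from the interview.


def build_statefulset(name: str, replicas: int, claim: str, size: str) -> dict:
    """Return an apps/v1 StatefulSet manifest as a plain dict."""
    labels = {"app": name}
    return {
        "apiVersion": "apps/v1",
        "kind": "StatefulSet",
        "metadata": {"name": name},
        "spec": {
            "serviceName": name,  # headless Service that gives pods stable DNS names
            "replicas": replicas,
            "selector": {"matchLabels": labels},
            "template": {
                "metadata": {"labels": labels},
                "spec": {
                    "containers": [{
                        "name": name,
                        "image": "example.invalid/kafka:latest",  # placeholder image
                        "volumeMounts": [
                            {"name": claim, "mountPath": "/var/lib/kafka"}
                        ],
                    }]
                },
            },
            # One PersistentVolumeClaim is created per replica from this template.
            "volumeClaimTemplates": [{
                "metadata": {"name": claim},
                "spec": {
                    "accessModes": ["ReadWriteOnce"],
                    "resources": {"requests": {"storage": size}},
                },
            }],
        },
    }


def replica_identities(manifest: dict) -> list:
    """List the (pod name, PVC name) pair derived for each replica ordinal."""
    name = manifest["metadata"]["name"]
    claim = manifest["spec"]["volumeClaimTemplates"][0]["metadata"]["name"]
    count = manifest["spec"]["replicas"]
    return [(f"{name}-{i}", f"{claim}-{name}-{i}") for i in range(count)]


if __name__ == "__main__":
    sts = build_statefulset("kafka", replicas=8, claim="data", size="100Gi")
    for pod, pvc in replica_identities(sts):
        print(f"{pod} -> {pvc}")
```

Because the claim name is derived from the pod's ordinal, a rescheduled `kafka-6` is bound to the same `data-kafka-6` claim again, which is the "disk follows the replica" property the guest describes. A Deployment has no such per-replica claim template.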