theCUBE presents KubeCon + CloudNativeCon Europe 2022, brought to you by Red Hat, the Cloud Native Computing Foundation, and its ecosystem partners.

Welcome to Valencia, Spain, and KubeCon + CloudNativeCon Europe 2022. I'm your host, Keith Townsend, along with Paul Gillin. We're going to talk to some amazing folks, but first, Paul, do you remember your college days?

Oh, vaguely. A lot of them are lost.

I think a lot of mine are lost as well. Well, not really. I got my degree as an adult, so they're not that far in the past. I can remember, because I have the student debt to prove it. With us today are Kenneth Hoste, systems administrator at Ghent University, and Marcel Hild, senior manager of software engineering at Red Hat. You're working in the Office of the CTO?

That's absolutely correct, yes.

So first off, I'm going to start with you, Kenneth. Tell us a little bit about the research the university does. What's the end result?

Oh wow, that's a good question. The research we do at the university is very broad. We have bioinformaticians, physicists, people looking at financial data, all kinds of stuff. The end result can be very varied as well. Very often it's research papers or spin-offs from the university. It depends a lot on the domain, I would say.

So that sounds like the perfect environment for cloud native: infrastructure that's completely flexible, where researchers can come and have a standard way of interacting, and each team just uses resources as they need them. It's nirvana for cloud native. But somehow, I'm going to guess, HPC isn't quite there yet.

Yeah, not really, no. HPC is, let's say, slow to adopt new technologies. We're definitely seeing some impact from cloud, especially things like containers and Kubernetes, so we're starting to hear about these things in the HPC community as well. But I haven't seen a lot of HPC clusters that are really fully cloud native.
Not yet, at least. Maybe it's coming, and walking around here at KubeCon, I'm definitely being convinced that it's coming. So whether we like it or not, we're probably going to have to start worrying about this stuff. But the most prominent technologies in HPC are still things like MPI, which has been around for 20, 30 years. The Fortran programming language is still the main language. If you look at compute time being spent on supercomputers, over half of it is spent in Fortran code, essentially. Either the application itself, where the simulations are being done, is implemented in Fortran, or the libraries you're calling from Python, for example, for doing heavy-duty computations have a backend implemented in Fortran. If you take all of that into account, easily over half of the time is spent in Fortran code.

So is this because the libraries don't migrate easily to a distributed environment?

Well, it's multiple things. First of all, Fortran is very well suited for implementing these types of things. We haven't really seen a better alternative, maybe. And it would be a huge effort to re-implement that same functionality in a newer language, so the case has to be very convincing. There has to be a very good reason to move away from Fortran, and at least the HPC community hasn't seen that reason yet.

So in theory, and right now we're talking about the theory and what it takes to get to the future, I can take that Fortran code, compile it, and run it in a container.

Yeah, of course.

Why isn't it that simple?

I guess because traditionally HPC is very slow at adopting new stuff. That's not to say there isn't a reason to start looking at these things. Flexibility is a very important one. For a lot of researchers, their compute needs are very peaky.
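Kenneth's point about Fortran backends is easy to see from Python. NumPy's linear algebra, for example, dispatches to LAPACK, a library whose reference implementation is written in Fortran, so the Python code is mostly orchestrating Fortran-era numerical routines. A minimal sketch, assuming NumPy is installed:

```python
import numpy as np

# Solve the dense linear system A @ x = b.
# numpy.linalg.solve dispatches to LAPACK's gesv routine;
# LAPACK's reference implementation is Fortran code, so the
# heavy numerical work here is not done in Python at all.
A = np.array([[3.0, 1.0],
              [1.0, 2.0]])
b = np.array([9.0, 8.0])

x = np.linalg.solve(A, b)
print(x)  # solution vector, approximately [2, 3]
```

SciPy goes further in the same direction: much of `scipy.linalg` wraps the same LAPACK/BLAS lineage, which is why "over half the compute time is Fortran" holds even for Python-driven workloads.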
They're doing research, they have an idea, they want to run lots of simulations and get the results, but then they're quiet for a long time, writing the paper or thinking about what they can learn from the results. So there are lots of peaks, and that's a very good fit for a cloud environment. At the scale of a university, you have enough diversity in users that all those peaks will never fall at the same time, so even if you have your own big infrastructure, you can still fill it up quite easily and keep your users happy. But this bursty pattern, yeah, I guess we're seeing that more and more.

So Marcel, talk to us about Red Hat needing to serve these types of end users. That can be on both ends: I imagine you have some people still writing Fortran, and some people asking for object-based storage. Where is Red Hat in providing the underlay and the capabilities for the HPC and AI community?

Yeah, if you look at our user base, it's on a spectrum from development to production. Putting AI workloads into production is an interesting challenge, but it's easier to solve, and has been solved to some extent, than the development cycle. So what we're looking at in that domain is more the end user, the data scientist, developing code and running experiments. Putting those into production is where containers live and thrive: you containerize your model, you containerize your workload, you deploy it into your OpenShift or Kubernetes cluster, done; you monitor it, done. So the software development and the SRE, the ops part: done. But how do I get the data scientist into this cloud-native age, where he's not developing on his laptop, or on a machine that he SSHes into to do some stuff there, and then some sysadmin comes and needs to tweak it because it's running out of memory or whatnot?
But how do we provide him an environment that is good enough to work in, in the browser, with an IDE, where the computation and experimentation workload is repeatable, so the environment is always the same; reliable, so it's always up and running, yet doesn't consume resources while it's just sitting there; and where the supply chain and the configuration of the modules brought into the system are also reliable? All these problems that we solved in the traditional software development world now have to transition into the data science and HPC world, where the problems are similar but yet different. So it's also, more or less, a huge educational problem, and a matter of transitioning the tools over.

Is this mostly a technical issue or a cultural issue? I mean, are HPC workloads so different from more conventional OLTP workloads that they would not adapt well to a distributed, containerized environment?

I think it's both. On one hand it's a cultural issue, because you have two different communities. Everybody is reinventing the wheel, everybody is somewhat siloed, so they think: this is what we've done for 30 years, there's no need to change it. And here at KubeCon, where you have different communities coming together, it's: okay, this is how you solve the problem; maybe this applies to our problem as well. But it's also the tooling, which is bound to a machine, bound to an HPC computer, which is architecturally different from a distributed environment where you treat your containers as cattle, as something you can replace. The HPC community usually builds up huge machines, and those are like their great machines, so there's also a technical bit of moving it into this age.
So the massively parallel nature of HPC workloads, you're saying, Kubernetes has not yet been adapted to that?

No, I think the parallelism works great. It's just a matter of moving that out from an HPC computer into the scale-out factor of a Kubernetes cloud that elastically scales out. The traditional HPC computer, and Kenneth can correct me here, is more like: I have this massive computer with a million cores or whatnot, now use it; I can book my time slice on it. The Kubernetes concept is more like: I have a thousand cores, I declare something into it, and I scale it up and down based on need.

So Kenneth, this is where you talked about the culture part of the changes that need to happen. Quite frankly, the computer is a tool, a tool to get to the answer. If the tool is working, if I have a thousand cores on a single HPC system, and you're telling me, well, I can get you a system with two thousand cores, and if you containerize your process and move it over maybe you'll get to the answer 50% faster, maybe I'm not interested; someone has to make that decision. How important is it to get people involved in these types of communities from the researcher side, because researchers are a very tight-knit community, to have these conversations and help that sea change happen?

I think it's very important. Those communities, let's say the cloud community and the HPC research community, should be talking a lot more; there should be way more cross-pollination than there is today. I'm actually happy that I've seen HPC mentioned in booths and talks quite often here at KubeCon. I wasn't really expecting that, and it's my first KubeCon, so I don't know, but I think that's kind of new; that's pretty recent.
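The contrast Marcel draws here, booking a time slice on one big machine versus declaring a workload and letting the platform place and scale it, can be sketched as a Kubernetes Job manifest. Everything below (names, image, sizes) is illustrative rather than a real deployment:

```yaml
apiVersion: batch/v1
kind: Job
metadata:
  name: simulation-sweep          # hypothetical batch workload
spec:
  parallelism: 50                 # run up to 50 pods at once...
  completions: 200                # ...until 200 tasks have finished
  template:
    spec:
      restartPolicy: Never
      containers:
        - name: worker
          image: registry.example.com/sim-worker:latest   # illustrative image
          resources:
            requests:
              cpu: "4"            # declare per-task needs; the scheduler
              memory: 8Gi         # finds room, no whole-machine reservation
```

Instead of reserving a fixed slice of a supercomputer, the manifest declares what each task needs and how many may run concurrently; the cluster's scheduler (and, if configured, autoscaler) does the rest.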
If you go to the HPC community conferences, containers have been there for a couple of years now; something like Kubernetes is still a bit new. But just this morning there was a keynote by a guy from CERN who was explaining that they're basically slowly moving towards Kubernetes, even for their HPC clusters, and he sees that as the future because of all the flexibility it gives you. And you can basically hide all of that from the end user, from the researcher. They don't really have to know that they're running on top of Kubernetes; they shouldn't care. Like you said, to them it's just a tool, and they care that the tool works and they can get their answers; that's what they want to do. How that's actually done in the background, they don't really care.

So talk to me about the AI side of the equation, because when I talk to people doing AI, they're on the other end of the spectrum. What are some of the benefits they're seeing from containerization?

I think it's the reproducibility of experiments. Data scientists are scientists, and they do research, so they care about their experiment. Maybe they also care about putting the model into production, but I think from a geeky perspective they're more interested in finding the next model, finding the next solution. They do an experiment and they're done with it, and then maybe it goes to production. So how do I repeat that experiment a year from now, so that I can build on top of it? A container, I think, is the best solution to wrap something up with its dependencies, freeze it, maybe even with the data, store it away, and then come back to it later and redo the experiment, or share it with some of my fellow researchers so they don't have to go through the process of setting up an equivalent environment on their own machines, be it a laptop, be it a cloud environment. You go to the internet, download something, it doesn't work; a container works.

You said something that really
intrigues me, this concept: I can have, let's say, a one-terabyte data set, have an experiment associated with it, take a snapshot of it somehow, I don't know how, share it with the rest of the community, continue my work, and then later we can come back and compare notes. On a maturity scale, what are some of the pitfalls or challenges customers should be looking out for?

I think you actually said it right there: how do I snapshot a terabyte of data? That's a terabyte of data, and if you snapshot it, you have two terabytes of data. Or you just snapshot it like in Git: okay, this is currently where we're at. That's where the technology is evolving: how do we do source control management for data, how do we license data, how do we make sure the data is unbiased, et cetera. That's going more into the AI side of things, but dealing with data in a declarative way, in a containerized way, I think that's where a lot of innovation is currently happening.

What do you mean by dealing with data in a declarative way?

I say: I run this experiment based on this data set, and I run this other experiment based on this other data set, and I as the researcher don't care where the data is stored; I care that the data is accessible. So I might declare: this is the process that I apply to my data, like a data processing pipeline, these are the steps it goes through, and eventually it will have gone through that process and I can work with my data. It's pretty much applying the concept of pipelines to data. You have these data pipelines, and now you have Kubeflow Pipelines as one solution for applying the pipeline concept to, well, managing your data.

Given the stateless nature of containers, is that an impediment to HPC adoption, because of the very large data sets that are typically involved?

I think it is. If you have terabytes of data, you have to get it to the place
where the computation will happen. Just uploading that into the cloud is already a challenge. If the data is sitting there on a supercomputer, and maybe it's been sitting there for two years, you probably don't care. And typically, at a lot of universities, the researchers don't necessarily pay for the compute time they use; at least in Ghent that's the case. It's centrally funded, which means the researchers don't have to worry about the cost; they just get access to the supercomputer. If they need two terabytes of data, they get that space and can park it on the system for years, no problem. If they need 200 terabytes of data, that's absolutely fine.

But the university cares about the cost.

The university cares about the cost, but they want to enable the researchers to do the research they want to do. We always tell researchers: don't feel constrained by things like compute power or storage space. If you're doing smaller research because you're feeling constrained, you have to tell us, and we will just expand our storage system or buy a new cluster to enable your research.

Wonderful. It's a nice environment to be in. I think this might be a Jevons paradox problem: you give researchers this capability and you're going to see some amazing things, but now people are snapshotting one, two, three, four, five different versions of one terabyte of data. It's a good problem to have, and I hope to have you back on theCUBE talking about how Red Hat and Ghent have solved those problems. Thank you so much for joining theCUBE. From Valencia, Spain, I'm Keith Townsend, along with Paul Gillin, and you're watching theCUBE, the leader in high-tech coverage.