Hi. I'm Nuwan Goonasekera and I work for the Melbourne Bioinformatics Group at the University of Melbourne. Today, on behalf of my colleagues, I'd like to present the Galaxy Helm Chart version 4, a chart that has been in development for a while now. We'd like to present some of the latest work that has gone into it, give you some background on what the Helm chart is, when you might want to use it and what scenarios it might be suitable for, take a look at some of the new and exciting features that have been introduced, and look at where the chart is heading in future. To start off, let's have a quick recap of what a Kubernetes Helm chart is. A Helm chart is an installable package for Kubernetes. It's much like a Docker Compose file in that it describes multiple containers and how they're interconnected. But in addition, it is also deployable as a versioned package that you can deploy, upgrade, and roll back as a unit when necessary. So it makes it very easy to manage a complex application as a single unit. It's as simple as saying helm install galaxy, for example, and you can say helm upgrade galaxy to get to the next version of Galaxy, or helm rollback galaxy to downgrade it. So it's a much more convenient way to interact with and manage a complex application. The Helm chart was first released in 2016, and it has gone through many iterations; today it is a fairly mature package, and it has been deployed in a wide variety of situations, which we will go into in the latter part of this presentation. Let's take a quick look at how you would install the Helm chart and get a Galaxy instance up and running. The first thing I'm going to do is SSH into this machine, which is an Ubuntu machine I launched on the cloud, and which has a running Kubernetes cluster that I pre-created.
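The install/upgrade/rollback lifecycle just described maps onto the standard Helm commands; as a sketch (the release name "galaxy" and the repo/chart name are illustrative, not necessarily the chart's actual coordinates):

```shell
# Install a release named "galaxy" from the chart (repo/chart name illustrative)
helm install galaxy galaxyproject/galaxy

# Upgrade the whole release in place to a newer chart version
helm upgrade galaxy galaxyproject/galaxy

# Roll the release back to its previous revision if the upgrade misbehaves
helm rollback galaxy
```

Because Helm tracks each release as a numbered revision, the rollback restores the entire application, Kubernetes manifests and configuration values together, as one unit.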
Typically, to install a Kubernetes cluster, you can use something like k3s, which will create a running cluster for you in about 30 seconds. Once we've got the cluster running and kubectl configured, everything is in place. The first thing we have to do is add the repository for the Galaxy Helm chart, so I'm going to do that. The command is helm repo add followed by the URL of the repository. We're going to do that first, which adds the galaxyproject repository, and then do a helm repo update to fetch the latest updates from that repository. Once we've done that, we can finally get to the stage of installing the Galaxy Helm chart. That's a simple matter of running helm install with the chart that we want to install and the options we want to set. In this case, I will enable CVMFS and also deploy the CVMFS storage driver, and define the storage class, which is how persistence should be managed, to be NFS. Once I run this command, within roughly three minutes we will have Galaxy up and running. Next, let's check whether the Helm chart deployed properly. I'm going to check whether the Galaxy containers are present, and we see that the Galaxy web, workflow, and job containers have been deployed. Of course, the database has also been deployed for Galaxy, and the CVMFS plugin pods have also been deployed. So in around three minutes, we should expect to see a running Galaxy instance with the full tool set, comparable to the usegalaxy.* servers. I'm going to switch over to the browser once it's ready and access the Galaxy instance. I'm now going to try and access the instance that we just deployed. We see that Galaxy is up and running and it has a fairly comprehensive tool set; we can run some jobs straight away, and we see that the instance functions as expected. So when and why should we use this chart?
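The steps above can be sketched as a short command sequence. The chart repository URL is a placeholder, and the --set value names are illustrative; check the chart's values.yaml for the actual keys controlling CVMFS and persistence:

```shell
# Create a single-node Kubernetes cluster with k3s (one quick option)
curl -sfL https://get.k3s.io | sh -

# Add the Galaxy Helm chart repository (substitute the real repository URL)
helm repo add galaxyproject https://<galaxy-helm-chart-repo>
helm repo update

# Install Galaxy with CVMFS enabled and NFS-backed persistence
# (value names illustrative -- consult the chart's values.yaml)
helm install galaxy galaxyproject/galaxy \
  --set cvmfs.deploy=true \
  --set persistence.storageClass=nfs

# Watch the web, job, and workflow handlers, database, and CVMFS pods come up
kubectl get pods -w
```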
This chart has been designed for systems administrators who want to take advantage of containerization. You can start off with a simple out-of-the-box installation: it takes less than five minutes to go from nothing at all, just a bare Kubernetes cluster, to a working version of Galaxy with CVMFS, the database, and all other requirements installed. The chart is designed to scale horizontally as required. It allows you to have zero-downtime maintenance, and the main Docker image is a fairly size-optimized image of less than 250 megabytes compressed. And the chart itself continues to evolve and keep pace with the latest versions of Galaxy. You might ask the question: why Kubernetes over other options, such as, for example, Ansible? We think the main reason goes back to containerization itself. Containerizing applications offers significant benefits over the alternative, which is to treat your entire operating system as your oyster. Containerized applications are isolated: you know exactly what files are mapped into a container, what ports are exposed, and what inputs are fed into it. So it's more modular and comprehensible compared to the alternative. Secondly, you can avoid a lot of the conflicts that happen due to shared libraries and so on. You get better security. The container itself becomes portable, so you can move it to another operating system, or upgrade your operating system, without fear of breaking any applications on the system. And of course, reproducibility is also a significant factor. If you accept containerization as a fundamentally better approach to managing applications, Kubernetes is a natural evolution of that. With Kubernetes, any containerized application at some point will need some kind of lifecycle management; it will need to be properly distributed and managed across multiple machines as you scale.
You'll need service discovery, load balancing, self-healing, storage provisioning, and zero-downtime maintenance, all of which Kubernetes provides. Of course, it comes at a cost: there is an additional layer of abstraction, it does have a significant learning curve, and it simplifies certain things but complicates certain others. So there's no free lunch, but by and large, a lot of these problems that were beyond the reach of small teams now become accessible, and a lot of the knowledge and practices become far more transferable across Kubernetes platforms. In fact, there are ready-made packages for a lot of the tasks that a particular administrator might want to do, in very well packaged and portable formats. Next, let's take a look at some of the new features that have been added to the chart. Let's start off with the high-availability Postgres operator. An operator is a Kubernetes-native way to extend its API so that you have a more natural way to interact with Kubernetes. What you see on the right here is an example of that, where we define an object of type postgresql. If you define a Postgres database this way, you simply declare the version you want and the size you want, and Kubernetes will understand the type natively by delegating it to the operator. Custom resource definitions of this type, called CRDs, are implemented by operators. Not only is it more natural to interact with Kubernetes that way, operators also typically provide far more automation. As the name itself implies, the idea is to automate or replicate some of the typical actions that a human operator might take, such as replication for fault tolerance; even major version upgrades can be automated simply by changing the version, and the operator will take care of the upgrade process. To demonstrate high-availability Postgres, let's go back to our freshly installed Galaxy Helm chart.
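A manifest of the kind described above looks something like the following sketch, assuming the Zalando postgres-operator's postgresql CRD; the resource name and field values are illustrative, so consult the operator's reference for the exact schema:

```shell
# Declare a Postgres cluster as a custom resource; the operator watches
# for objects of this kind and creates/manages the actual database pods.
kubectl apply -f - <<EOF
apiVersion: acid.zalan.do/v1
kind: postgresql
metadata:
  name: galaxy-galaxy-postgres
spec:
  teamId: galaxy
  numberOfInstances: 1        # replica count; raise this for high availability
  volume:
    size: 10Gi                # desired database volume size
  postgresql:
    version: "13"             # bumping this triggers an automated upgrade
EOF
```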
Here I'm in Rancher, which is a graphical browser for Kubernetes clusters, where you can see the containers that have been deployed as part of this chart. We can see that there is a Galaxy Postgres container, and if we go into this container and look at the logs, we can see that this is the leader database pod, and at the moment there's only one replica. Out of the box, this is a simple single-replica setup, but let's scale it up by using the Kubernetes replica feature. We're going to scale it up to a second replica, and we can see that a new pod came up. If we go into that new container, we see that it is a secondary pod and that it is following the leader pod. With that, we now have a two-replica cluster, and if we go back to Galaxy, it continues to be functional. Now let's see what happens if we delete the original database pod. If we delete the original one, the secondary replica will immediately be promoted to leader by the operator, and we can see that in a few seconds it has become the leader. In the meantime, Kubernetes has automatically restarted the deleted pod, because it needs to maintain two replicas, and if we go back and check the logs of the original pod, it has now become a secondary. All the while, Galaxy continues to function, and we can see that the data is intact. Let's go back to the Postgres replicas and scale back down to one, and the original leader automatically reassumes leadership. All of that was handled completely by the operator; we didn't really have to do anything, and it just works this way out of the box. The Postgres operator automates a lot of the functions that a human operator would perform: for example, it can do database upgrades, and even major version upgrades are possible simply by modifying the custom resource definition.
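The failover demo just shown can be reproduced from the command line; as a sketch, assuming a Zalando-style postgres-operator deployment (resource names and labels are illustrative):

```shell
# Scale the Postgres cluster to two replicas by patching the custom resource
kubectl patch postgresql galaxy-galaxy-postgres \
  --type merge -p '{"spec": {"numberOfInstances": 2}}'

# See which pod is currently the leader and which is the following replica
kubectl get pods -l application=spilo -L spilo-role

# Delete the current leader: the operator promotes the replica within seconds,
# and Kubernetes recreates the deleted pod, which rejoins as a secondary
kubectl delete pod galaxy-galaxy-postgres-0
```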
So all of that is highly simplified and managed automatically by the operator. Another new feature that has been added is bundled monitoring, metrics, and visualizations. The moment you set up a Galaxy instance of any size, you'll probably want to know what's going on with the cluster and have some visibility into what tools are running, what kind of throughput you're seeing, what kinds of workflows and tools are being executed, and how many users there are or workflows have been run, and so on. A lot of work has already gone into this in the usegalaxy.* federation, and we have reused that work and bundled it with the chart, so that now when you activate metrics on the chart, all of that data will automatically be sent to an InfluxDB database, and you can subsequently visualize it in Grafana. The Grafana dashboards are also bundled with the Helm chart, so the moment you activate metrics, they'll be available in Grafana. Another area where a lot of effort has been put in is increasing the robustness and quality control of the Helm chart. Towards this end, we've integrated GitHub Actions. Alex has done a lot of the work here, transitioning the old Travis-based testing we had in the Helm chart to GitHub Actions, where not only is the chart integration-tested, it's also automatically version-bumped, packaged, and bundled whenever there's a commit. This makes the process of updates much faster and also ensures that we have a continuous delivery process built in, with a GitHub-style workflow. A problem which has been significantly thorny for all deployment methods is interactive tools. Interactive tools have been notoriously difficult to set up; it's very difficult to get them to run robustly and scalably, and it requires a significant amount of effort. With the new Helm chart, a lot of effort has gone into automating this process.
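Activating the bundled metrics stack is a matter of flipping a chart value at install or upgrade time; as a sketch, where the value key shown is an assumption on my part rather than the chart's confirmed name, so check values.yaml:

```shell
# Turn on the bundled InfluxDB/Grafana metrics stack for an existing release,
# keeping all previously set values (the "metrics.enabled" key is illustrative)
helm upgrade galaxy galaxyproject/galaxy \
  --reuse-values \
  --set metrics.enabled=true
```

Because the Grafana dashboards ship with the chart, no separate dashboard import step should be needed once metrics are on.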
You can listen to Alex's talk on this, where, using the features of Kubernetes, it has been possible to reliably deploy interactive tools across any Kubernetes cluster. Another area which has received significant attention is faster startup time while loading a complete tool set. This has been particularly important where Galaxy instances are launched on demand and startup time is a significant concern. A lot of work has gone into making these tool configurations downloadable as archives, incrementally, so that you have the fastest possible startup; Alex has put in work that brings startup down from around six minutes, an almost 40% improvement, by using bundled archives, which can be used as a customizable alternative to CVMFS. Health checks have also been improved, and they've been extended to cover not just the web handlers but also the job and workflow handlers. By using health checks, we make sure that these handlers are responding as expected, and in fact Kubernetes will automatically restart a failed handler. So we have the double advantage of not only knowing that the application is running but also recovering from faults. In summary, the Helm chart has seen a lot of work go into it to be fully functional out of the box, particularly in scenarios which may be harder to set up traditionally. For example, interactive tools, zero-downtime configuration updates, and high-availability databases, which normally require significant experience and expertise to set up, are now bundled into this chart, which can be set up in a matter of five minutes on any Kubernetes cluster. It's also designed to have no single point of failure, which makes it very suitable for production deployments with horizontal-scalability requirements. Taking a look at the future, as the Galaxy Helm chart continues to mature, we hope to see it being used in a wider variety of situations and environments.
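The health-check mechanism described above is Kubernetes' standard liveness/readiness probing. A minimal self-contained illustration of the mechanism, not the chart's actual handler configuration (the pod, image, and probe values here are illustrative):

```shell
# A pod whose liveness probe restarts the container when its HTTP endpoint
# stops responding, and whose readiness probe withholds traffic until it is up
kubectl apply -f - <<EOF
apiVersion: v1
kind: Pod
metadata:
  name: probe-demo
spec:
  containers:
  - name: web
    image: nginx
    livenessProbe:            # restart the container after repeated failures
      httpGet: {path: /, port: 80}
      periodSeconds: 10
      failureThreshold: 3
    readinessProbe:           # only route traffic once the server responds
      httpGet: {path: /, port: 80}
      periodSeconds: 5
EOF
```

This is what gives the double advantage mentioned above: the probes both report handler health and trigger automatic recovery.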
And we hope that the community will keep contributing those changes and improvements back into the Helm chart. We'd also like to leave you with a thought about Galaxy itself. As Galaxy has progressively become more complex over time and acquired new functionality, we think the deployment strategies will also need to adapt. So the question is: should containers be the primary, or the recommended, mechanism for deploying Galaxy? And if so, to what extent could that simplify Galaxy's environment? I think this is a discussion that is worth having and investigating, because it could potentially simplify the environment and lead to a reduction in complexity; the interactive tools are a good example of this. We should hopefully be able to promote usability and portability across a wider variety of environments by standardizing on the fabric of deployment, and we think that a suitable fabric might be Kubernetes, precisely because it is seeing a lot of adoption across industry and academia. Which leaves us with the question: can we perhaps even re-architect Galaxy as a container-native application? Just something to potentially discuss during GCC. With that, we'd like to conclude, and thank you for your attention. We'll be around during the conference to take any questions, so if you need any more information or help with setting things up, we'd be happy to answer them. Thank you.