G'day everyone. My name is Matthew Erickson, and today we're going to take a look at how we can back up our Kubernetes data.

So why are we talking about backup when containers are stateless? Well, there are two main reasons. Stateless applications eventually need to be migrated for life cycle events or Kubernetes version upgrades, and stateful containers require protection of their persistent data. Effectively, we're planning for the failure of hardware, software, or even just plain old human error.

So before we get into the mechanics of how protection actually works, it's important to consider all of the sources of data outside of your Kubernetes cluster as well, from the sandbox development environments for your developers, through your container image registries and Helm charts, to the eventual writing of your persistent data down to your storage subsystem, which I've shown here as Rook and CSI. That could be any one of the software-defined or traditional storage vendors available today.

So what is an application? Well, that's a little bit of a work in progress. The Application CRD program is working on recording the dependencies between API resources or objects. But today you can safely assume that an application is a collection of API resources and, optionally, some persistent data.

So let's take a look now at the two use cases, or rather the two methods, by which you can perform protection today. Looking first at data-centric backup.
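As a small illustration of that working definition (an application is a collection of API resources plus, optionally, some persistent data), here is a minimal Python sketch. It does not talk to a real cluster. The `Resource` and `Application` types and the `group_by_app` helper are hypothetical names made up for this example, and grouping on the `app.kubernetes.io/name` label is an assumption about how resources might be tied together, not something the Application CRD prescribes.

```python
from dataclasses import dataclass, field

# Assumption: resources belonging to one application share this label.
APP_LABEL = "app.kubernetes.io/name"


@dataclass
class Resource:
    """Hypothetical, simplified stand-in for a Kubernetes API object."""
    kind: str
    name: str
    labels: dict = field(default_factory=dict)


@dataclass
class Application:
    """An application: a named collection of API resources."""
    name: str
    resources: list = field(default_factory=list)

    def persistent_claims(self):
        # The optional persistent data, represented here by PVCs.
        return [r for r in self.resources if r.kind == "PersistentVolumeClaim"]

    def is_stateful(self):
        return bool(self.persistent_claims())


def group_by_app(resources, label=APP_LABEL):
    """Group resources into applications; unlabeled resources are skipped."""
    apps = {}
    for r in resources:
        app_name = r.labels.get(label)
        if app_name is None:
            continue
        apps.setdefault(app_name, Application(app_name)).resources.append(r)
    return apps


cluster = [
    Resource("Deployment", "web", {APP_LABEL: "shop"}),
    Resource("Service", "web", {APP_LABEL: "shop"}),
    Resource("PersistentVolumeClaim", "db-data", {APP_LABEL: "shop"}),
    Resource("Deployment", "docs", {APP_LABEL: "docs"}),
    Resource("ConfigMap", "orphan"),
]
apps = group_by_app(cluster)
for app in apps.values():
    print(app.name, len(app.resources), app.is_stateful())
```

In this sketch the "shop" application has both metadata and persistent data to protect, while "docs" only has metadata, which mirrors the stateless and stateful cases mentioned at the start.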
There is typically a custom operator that coordinates the scheduling of volume snapshots and then replication to an alternate or secondary storage array, which you can see at the bottom there. That custom operator provides resources that allow both scheduling and retention of those snapshots. Recovery in that use case means taking one of those snapshots, creating a volume out of it, and then rescheduling the application on the remote cluster using developer-supplied YAML or application manifests.

The application-centric version of protection has one important distinction, which we can see in step one here. In step one, we are collecting the application manifests or config, so that the config or application metadata is recorded along with the persistent data in step two. This method allows us to then go and recreate the volume, and then reschedule the application in step two using all of the recorded application metadata, and this allows us to recover that application to any API-compatible cluster.

So which method is best? Which one should you use? As always, it depends. If you are orchestrating your infrastructure from Kubernetes and you're using kubectl or GitOps to manage your infrastructure, then the data-centric approach offers a vastly enhanced protection capability, because you can leverage things like changed block tracking and replication that are not available via CSI today. If you have centralized operations teams that are performing life cycle or centralized backup and recovery for traditional and containerized apps, then the application-centric model may be more appropriate for you.

So moving on, if we look at our Harbor data now, we can see from the screenshot in the bottom right here that even the Harbor project recommends performing a backup of your Harbor instance before you do upgrades.
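To make the scheduling and retention idea from the data-centric method a little more concrete, here is a minimal Python sketch of a retention decision. It is not taken from any particular operator. The `Snapshot` type, the `select_expired` function, and the `keep_last` and `max_age` parameters are hypothetical names, and the policy itself (keep the newest few, expire anything else older than a cutoff) is an assumed example of what such an operator's retention resource might express.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class Snapshot:
    """Hypothetical record of one volume snapshot."""
    name: str
    created: datetime


def select_expired(snapshots, keep_last, max_age, now):
    """Return the snapshots an assumed retention policy would prune.

    A snapshot is kept if it is among the `keep_last` newest ones,
    or if it is no older than `max_age`; everything else expires.
    """
    newest_first = sorted(snapshots, key=lambda s: s.created, reverse=True)
    protected = set(newest_first[:keep_last])
    return [
        s for s in newest_first
        if s not in protected and now - s.created > max_age
    ]


# Six daily snapshots, aged 0 to 5 days relative to a fixed "now".
now = datetime(2024, 1, 10, 12, 0)
snaps = [Snapshot(f"pvc-data-{d}", now - timedelta(days=d)) for d in range(6)]

expired = select_expired(snaps, keep_last=2, max_age=timedelta(days=3), now=now)
print([s.name for s in expired])  # ['pvc-data-4', 'pvc-data-5']
```

Passing `now` in explicitly, rather than reading the clock inside the function, keeps the decision repeatable, which is handy when you want to check a policy before letting anything delete snapshots.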
So replication is not a backup here. You should absolutely have a method for taking a copy of your Harbor application config or metadata, and the underlying data as well.

So what's happening now in the industry? You've got your existing applications completely under control and protected. What else is happening? Well, there's lots of enhancement going on with CSI, so that CSI will eventually be able to support things like changed block tracking or consistency group snapshots. So, lots going on. Definitely get involved in the community, stay on top of what the Data Protection Working Group is doing there, and bring your knowledge and experience in providing data protection to that group.

Enjoy the rest of the show. Thank you.