Hi, and welcome to our tutorial on why Kubernetes needs object storage. My name is Daniel Valdivia, and I work as an engineer here at MinIO. Over the next 15 minutes, I hope to show you the value of integrating object storage into your Kubernetes ecosystem, not only for your applications, but also to support your operational workflows. I want to start by talking about the symbiosis between applications and operations. We've seen that Kubernetes has enabled a great plethora of ways of deploying applications on top of it, but not only that: it has also simplified how operations are deployed and run around these applications. The symbiosis between these two areas of IT comes in the sense that operations provides the services and infrastructure that applications need to run smoothly. But the connection doesn't go in only one direction: applications also produce data and metrics that operations can collect in order to monitor and improve the way things are running. The ecosystem supporting both of these approaches is very broad, but for this conversation I only want to focus on the operations side of things. We've seen in the last few years that technology megatrends come in pairs. We saw this with the introduction of Hadoop MapReduce: this general-purpose computational framework was introduced back in the early 2000s, and it came paired with a nice distributed file system called HDFS. From then on, this was the approach to deploying applications: there would be a compute layer and a paired storage layer. We saw this, for example, with Nova and Swift, and when EC2 was introduced, it was nicely paired with S3. But since the introduction of Kubernetes, we have not seen what the right pair for Kubernetes is. Well, we know what the right pair for Kubernetes is, and that is object storage.
We say this because object storage is uniquely positioned to make your Kubernetes applications and workflows scalable in both directions, vertically and horizontally. There are some guiding principles that modern object storage needs to meet in order to satisfy your needs on top of Kubernetes. First, it needs to be performant. It needs to be extremely fast: running on top of hard disk drives, it should saturate those drives, and running on faster NVMe SSDs, it should be able to saturate the network. Not only that, a very fast, performant object store enables a whole new world of workflows, from artificial intelligence and machine learning to big data and analytics. It needs to be scalable, from terabytes to exabytes, and in a way that is seamless and doesn't disrupt operations. Modern deployments can no longer tolerate object storage that needs to rebalance every time you add capacity. It should be software-defined. All these cloud-native applications really don't care what type of hardware they run on top of, so they need to be paired with a software-defined storage solution that can scale and likewise doesn't care about how it's actually being run. And it needs to be simple. Simple is hard, in design and in scope, but it's very important: when things are designed in a simple fashion, they're easier to operate, and they're also easier to scale. You can see proof of this in the ecosystem. If we place object storage at the center of the ecosystem (and one of the popular object storage solutions out there is MinIO), you'll see there's a set of infrastructure services that can leverage object storage right out of the box, as well as all these application frameworks, built for cloud-native applications, that are now coming into place and can work straight from object storage.
So I want to talk to you briefly about the structure of this demo and how it's prepared. First, assume all of this is running on top of a Kubernetes cluster, so there's a Kubernetes API here, and also assume that some CI/CD pipeline is producing artifacts. In this case, I'm going to have a CI/CD pipeline that produces a Docker image, and the first thing I'm going to do is deploy it to a Harbor private Docker registry, with Harbor storing it on top of object storage. This way, I'm not constrained by the limitations of individual persistent volumes attached to my Harbor instances; Harbor can focus on the parts it's good at, like access control and its other nice features, while I delegate storage to MinIO. Additionally, I'm going to assume there's an application running on this cluster. For this application to start, it can pull its images straight from Harbor, and Harbor in turn will retrieve the image from MinIO. This application may also have, say, a database. I'll deploy Prometheus so I can collect usage metrics, from CPU and memory utilization to the number of API requests. I'll collect them normally from this application, but I can also have Prometheus collect metrics straight from the Kubernetes API. However, there's only so much data you can fit into a persistent volume. If you're in it for the long run, you need to store all these metrics long term, and this is where Thanos comes into place. Thanos gives you a highly available Prometheus deployment that can collect all these metrics and back them up into MinIO. And not only back them up: you can also compact them and query them straight from MinIO. Lastly, some of these applications may also have persistent volumes of their own.
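As a sketch of how Harbor delegates its storage to MinIO, the Harbor Helm chart lets you point the registry at any S3-compatible endpoint. The bucket name, endpoint, and credentials below are placeholders for your own tenant, not values from this demo:

```yaml
# Hypothetical Harbor chart values: back the image store with MinIO
# through the S3-compatible storage driver.
persistence:
  imageChartStorage:
    type: s3
    s3:
      region: us-east-1
      bucket: harbor                       # bucket created on the MinIO tenant
      accesskey: minio-access-key          # placeholder credentials
      secretkey: minio-secret-key
      regionendpoint: http://minio.tenant-ns.svc.cluster.local:80
      secure: false
      v4auth: true
```

With a configuration along these lines, Harbor's own PVCs only hold its database, and every pushed layer lands in the MinIO bucket instead.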
Let's say you have to back up your cluster. Backing up your cluster, or snapshotting your persistent volumes, is where Velero comes into place. Velero has the capability of backing up all the resources in your cluster, including your Kubernetes API configuration, snapshotting individual persistent volumes, and pushing all that data into MinIO. Again, this lets you put all this data into MinIO and scale MinIO independently without disrupting these applications: they can keep running unmodified forever, and you can just keep adding space, or even start controlling how the storage happens on MinIO. All right, to start the demo, I'm going to create a brand-new MinIO tenant using one of the examples in the MinIO Operator repository. This custom resource definition allows me to quickly define a brand-new tenant with a specific capacity, and all I have to do is apply my tenant definition. After I've applied the definition, I'll see that my tenant gets instantiated and initialized. This will in turn create a couple of StatefulSets, and additionally a service gets created. I've taken the liberty of setting up an Ingress controller for the MinIO service that was defined for my tenant. This allows me to open a new browser tab and go to the address I set up in my Ingress; it's a convenient way to see the buckets and objects stored on my object storage. As you can see, I already started by creating four buckets: one for Harbor, for storing my Docker images; one for Prometheus, for storing all my long-term metrics; one for my Thanos ruler; and lastly a bucket for my Velero backups. I've already installed Harbor, as you'll be able to see, and it's already properly configured. I'm not going to use this default project; I'm going to create my own new project, and let's call it demo. I'm going to make it public, keep the default quota, and hit Create. Now I'm greeted with a brand-new project that has no repositories.
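In sketch form, the tenant definition applied above looks something like the examples in the MinIO Operator repository. The names, pool sizing, and secret here are assumptions for illustration, not the exact manifest from the demo:

```yaml
# Minimal MinIO Tenant sketch (apply with: kubectl apply -f tenant.yaml)
apiVersion: minio.min.io/v2
kind: Tenant
metadata:
  name: demo-tenant
  namespace: minio-tenant
spec:
  credsSecret:
    name: demo-tenant-creds        # Secret holding the root credentials
  pools:
    - servers: 4                   # four MinIO pods in this pool...
      volumesPerServer: 4          # ...each claiming four volumes
      volumeClaimTemplate:
        spec:
          accessModes: ["ReadWriteOnce"]
          resources:
            requests:
              storage: 1Ti         # per-volume capacity
```

Applying this single resource is what makes the operator create the StatefulSets and service mentioned above; capacity later grows by adding pools rather than rebalancing.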
I can see it already offers me an example of push commands, which I'm happy to oblige. Let's tag some images. Based on this recommendation, I want to tag a local container image that I have here, say the latest MinIO release; I'll place that tag here, and repositories seem to be only one level deep. After I tag the image successfully, I first log in to my Harbor repository, and after that I perform my push operation. As easy as that, I push my Docker image into my Harbor repository. After the image gets pushed, if I go back to my Harbor browser, I can quickly refresh and see that my image is now in my project as an artifact. And if I go back to my MinIO browser, I will also see that the Docker image data itself got pushed into my registry bucket. When it comes to monitoring, I've already set up Thanos using the manifests from the repository I mentioned earlier. What this did was create a namespace called monitoring, where I'll be able to see all the StatefulSets that have been installed. We have three StatefulSets for this deployment, and all of them have been properly configured with the proper startup command. Just to exemplify what I'm talking about, let's look at the configuration of the Prometheus StatefulSet. Here we can see that all I really did as custom configuration was pass a specific object storage configuration for my object store, mounted as a volume from a ConfigMap. With this in mind, essentially what's going to happen is that every time Prometheus rotates out a block of metrics, Thanos is going to grab that block and ship it to object storage. I've configured Prometheus with a very aggressive rotation period, so every five minutes it's actually sending blocks out.
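For reference, the object storage configuration handed to the Thanos sidecar (via its `--objstore.config-file` flag, mounted from that ConfigMap) looks roughly like this. The bucket name, endpoint, and credentials are placeholders for the MinIO tenant:

```yaml
# Thanos objstore config sketch: point the sidecar at a MinIO bucket
# using the S3-compatible client.
type: S3
config:
  bucket: prometheus-long-term               # long-term metrics bucket
  endpoint: minio.tenant-ns.svc.cluster.local:80
  access_key: minio-access-key               # placeholder credentials
  secret_key: minio-secret-key
  insecure: true                             # plain HTTP inside the cluster
```

The same file format is shared by the Thanos store and compactor components, which is what makes compacting and querying straight from the bucket possible.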
If I go to the Thanos logs, we can see Thanos doing the work of putting the objects into object storage periodically. Every time there's a new block, Thanos takes it and pushes it into my Prometheus long-term bucket. Let's see what that information looks like. We see here that every five minutes, a new set of metrics is collected by Prometheus from my Kubernetes cluster and my applications, and at the same time Thanos is taking the metrics that are rotated out every five minutes and storing them in the long-term bucket. Lastly, let's see Velero. Starting a backup with Velero is pretty straightforward. You'll need the Velero CLI, and with it, all you really have to do is indicate that you want to perform a backup of your cluster. This will in turn schedule a backup to run in your Velero deployment. If you go into the Velero pods, we'll see traces of Velero performing a full backup of my PVCs. For example, I have plenty of volumes for my Prometheus deployment and its databases, and Harbor is also using some PVCs for its own database; this is snapshotting all those volumes I'm using. Along with this, the backup itself will be shipped to my object storage, where I can actually see the backups that I'm performing. If I were to uncompress this file, I would see that there's a lot of interesting information inside. For example, I can see a full backup of all the resources on my system, and if I were to explore, say, my stateful-set applications across all the namespaces, I'll see my Harbor StatefulSets, properly installed and backed up, as well as the StatefulSets for my MinIO tenant. As you can see, all these operations frameworks can rely on object storage, because it sits at the core of these operations. It removes the concern of "what if I run out of space" and makes everything extremely scalable.
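The Velero steps above can be sketched with a few CLI commands. The backup name here is an assumption; the flags shown are the standard Velero ones:

```shell
# Trigger a full-cluster backup (API resources plus volume snapshots):
velero backup create demo-backup

# Watch progress and inspect what was captured:
velero backup get
velero backup describe demo-backup --details
velero backup logs demo-backup
```

Each completed backup then appears as a compressed object in the Velero bucket on MinIO, which is the file being uncompressed and explored above.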
You can take these deployments from gigabytes to terabytes to petabytes to exabytes; the sky is the limit. Thank you for joining us in this conversation on why Kubernetes needs object storage. You can join us at any time on our Slack or on GitHub if you have any questions, and you can get started by reading our documentation and our blog posts. Don't hesitate to reach out: our community thrives on individual collaborators, and we wouldn't be here without our community. Thank you for your time, and I hope you have a great day.