Welcome to Amsterdam and KubeCon + CloudNativeCon 2023. Join John Furrier, Savannah Peterson, Rob Strechay, and the rest of theCUBE team as theCUBE covers the largest conference on Kubernetes, cloud native, and open source technologies, together with developers, engineers, and IT leaders from around the globe. Live coverage of KubeCon + CloudNativeCon 2023 is made possible by the support of Red Hat, the CNCF, and its ecosystem partners.

Good afternoon and welcome back to beautiful Amsterdam. We are at KubeCon Europe and it is so bright and so lovely here in Amsterdam. They're actually letting me keep my shades on today. Very excited. Thank you all for tuning in for this wild first day of coverage. Very excited to welcome our next guest. He's the CTO of MinIO. Please welcome Ugur. Welcome to the show. Thanks so much, Savannah. Great to meet you and great to be on the show. Great to meet you too. This is your first time on theCUBE? Absolutely, yes. I feel very lucky to be the one giving you the warm welcome. Thank you so much. Brave of everyone to trust me with this, but I appreciate it. So you're a pretty big player in the game, but just in case folks aren't familiar, what does MinIO do?

MinIO is a high performance, S3-compatible object storage. And when you talk... What does object storage mean? Just in case. Yeah, we are at the Kubernetes show and most people here, luckily, know about object storage, so we don't have to define it at the show, but just for everybody else: object storage is basically blob storage, large binary object storage. In the traditional sense there are three types of storage: block, file, and object. Block and file are designed with different protocols and different hierarchies and structures, but for most of the last 20 years or so they were designed for use inside the data center. Object storage, combined with the AWS S3 APIs, is meant to be used across the internet: RESTful APIs, warm storage, large scale, and, with MinIO, high performance. It's meant for storing large blobs of data and managing storage and data persistence as objects. That's the biggest difference: block storage is quite raw, a file system has a hierarchy, an inode table, and some technical ways of managing data in a structure, whereas object storage is very plain in terms of namespace. There's no structure or hierarchy. It's just one flat namespace. And with MinIO you get high performance, and thanks to AWS, they opened up the whole path for us to build a very high performance and scalable data persistence solution for the enterprise and for new technologies such as Kubernetes. And that's why we are here at KubeCon.
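A quick illustration of what "S3 compatible" and "one flat namespace" mean in practice. This is a minimal sketch using the MinIO Go SDK (minio-go v7); the endpoint, credentials, and bucket name are placeholders for the example, not anything mentioned in the interview.

```go
package main

import (
	"context"
	"log"
	"strings"

	"github.com/minio/minio-go/v7"
	"github.com/minio/minio-go/v7/pkg/credentials"
)

func main() {
	ctx := context.Background()

	// Connect to any S3-compatible endpoint. "play.min.io" and the keys
	// below are placeholders for illustration only.
	client, err := minio.New("play.min.io", &minio.Options{
		Creds:  credentials.NewStaticV4("ACCESS_KEY", "SECRET_KEY", ""),
		Secure: true,
	})
	if err != nil {
		log.Fatal(err)
	}

	// A bucket plus an object key is the whole "hierarchy": one flat namespace.
	// (MakeBucket returns an error if the bucket already exists.)
	if err := client.MakeBucket(ctx, "demo-bucket", minio.MakeBucketOptions{}); err != nil {
		log.Fatal(err)
	}

	// PUT a blob over the RESTful S3 API.
	payload := strings.NewReader("hello, object storage")
	_, err = client.PutObject(ctx, "demo-bucket", "greetings/hello.txt",
		payload, payload.Size(), minio.PutObjectOptions{ContentType: "text/plain"})
	if err != nil {
		log.Fatal(err)
	}
	log.Println("stored greetings/hello.txt")
}
```

The same code works against any S3-compatible endpoint, which is the portability point being made above.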
I mean, speaking of Kubernetes, you describe, you self-describe as Kubernetes native. What does that mean to you all? So it's very simple. We started with containers. When containers were starting out, MinIO was starting out as the open source object storage solution in that world. And we have written everything in Golang, which is very efficient for the things we wanted to achieve. The whole solution, the whole MinIO server, the persistent storage stack, was about 30 to 40 megabytes as a static binary. When we put the UI in it, it became a little larger, about 100 megabytes, but still dramatically small compared to what people do with appliance solutions or other software solutions. We achieved something really special in a very lightweight binary.

And it was so easy for people to take that, build it into containers, and start that way with their solutions. But that's one aspect. RESTful APIs and the S3 APIs made it more interactive from an SDK or API perspective; applications could write into MinIO easily. That made it really easy for microservices, hence cloud native. And when Kubernetes came into the orchestration play, it became so much easier for people to run us as a container, create a persistent layer, and use our erasure coding, which is how you protect data. Erasure coding is an algorithm similar to RAID, which is data protection in the traditional sense. We use erasure coding to protect the data, so they can run multiple containers, use erasure coding and MinIO, protect their data, and have cloud native applications and microservices easily talking to it and using it as their persistent storage solution. That's why it's cloud native.

You've said a word that we don't often hear associated with Kubernetes. You've said "easy" about four times. Talk to me a little bit more about that. We're all about decreasing complexity. Yeah, so MinIO itself is easy from an operational perspective. Many of our customers choose MinIO to drive operational costs down. And it's because... Also a huge thing right now. A fraction of an FTE, a full-time employee, is needed to support MinIO compared to other solutions, other appliances or other software solutions out there. Storage is complex if it's not done correctly. MinIO focuses on simplicity, and we have only done the S3 protocol, the S3 APIs, and a full stack solution for the enterprise. That's why we are easy. That's why we need a lot less... You're focused. We are very focused and it's simple. Literally, you can take the MinIO binary, run one single command, and start using MinIO S3-compatible storage on your laptop as a developer right away, with an access key and secret key, immediately. So that's the reason why we are simple. We're easy in Kubernetes because we have spent three to four years of engineering time developing our operator. An operator is a framework, a concept within Kubernetes, for deploying the same application or containers within Kubernetes with logic built into that operator's configuration. So you can easily scale it. You can deploy multiple tenants. You can deploy a MinIO cluster here and a cluster there within the same framework. You use the worker nodes from Kubernetes and PVs, persistent volumes, which are the raw storage that Kubernetes uses, and easily deploy it. That's why we spent three years of engineering time getting our operator right, so that we can deploy into Kubernetes. Good on you. Hence people here pay a lot of attention to the MinIO booth, because they like the idea of using an operator to deploy their persistent storage solution.

I believe that, and it seems like you've really honed your market. I know some of the world's biggest companies use you. Can you tell us a little bit about your customers? Yeah, we have more than 260 customers at this point, and a lot of it ranges from traditional use cases, people doing video streaming, including some of the largest streaming companies, and people who are into... Sorry. It can be any of them; they're in competition. So from streaming to Hadoop replacement stories, we have a wide spectrum of companies using MinIO. MinIO can be used as traditional object storage, and MinIO can be used as a modern data pipeline in a Kubernetes environment.
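Erasure coding came up above as "an algorithm similar to RAID." Here is a minimal sketch of that idea using the klauspost/reedsolomon Go library; it illustrates the general technique rather than MinIO's internal implementation, and the 4 data / 2 parity split is an arbitrary choice for the example.

```go
package main

import (
	"bytes"
	"log"

	"github.com/klauspost/reedsolomon"
)

func main() {
	// 4 data shards + 2 parity shards: any 4 of the 6 can rebuild the object.
	enc, err := reedsolomon.New(4, 2)
	if err != nil {
		log.Fatal(err)
	}

	original := []byte("object data that must survive disk or node loss")

	// Split the payload into data shards and allocate empty parity shards.
	shards, err := enc.Split(original)
	if err != nil {
		log.Fatal(err)
	}

	// Compute the parity shards.
	if err := enc.Encode(shards); err != nil {
		log.Fatal(err)
	}

	// Simulate losing two shards (for example, two failed drives).
	shards[0], shards[5] = nil, nil

	// Rebuild the missing shards from the survivors.
	if err := enc.Reconstruct(shards); err != nil {
		log.Fatal(err)
	}

	// Reassemble and confirm the original bytes came back.
	var buf bytes.Buffer
	if err := enc.Join(&buf, shards, len(original)); err != nil {
		log.Fatal(err)
	}
	log.Println("recovered:", buf.String())
}
```

In a distributed deployment the shards would be spread across drives and nodes, which is what lets an object survive hardware failures without a full replica of every byte.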
So we have different workloads and different use cases, and within Kubernetes, with AI and ML workloads especially, if people want to separate compute and data, if you're not into the classical Spark and Hadoop workloads and you want a query engine that is separate from your data, they come to MinIO to create a modern data pipeline. So a lot of the modern AI/ML workloads are running on Kubernetes with a data pipeline that uses MinIO. That's awesome. Let's talk a little bit more about that. What are you seeing in the data landscape right now? It's kind of a wild time. So the data landscape is dramatically changing. The whole buzzword in the last six months has been ChatGPT, and that itself is kind of the tip of the iceberg of AI/ML workloads, where you have a lot of training going on and the data sets are growing incrementally. It's so out of control, to the degree that now it's not just about the amount of data but also how to get that data uploaded, or put into the data storage systems, at scale.

So MinIO focuses on performance. We have done a few tricks with the chipsets. We use SIMD instruction sets, which are available in most chipsets: AMD, Intel, even ARM. We compile the code for that, so we use special registers on the chips to make our erasure coding fast. So, the CPU. So it's down to the system hardware level. That's awesome. They're CPU flags, and they're available on all commodity chipsets. And we use those to have a high performance erasure coding scheme in our software. And the CPUs are ideal for throughput, so object throughput essentially, the client data, the application data. When you combine that, it's easy to do performance at smaller scale. But in the AI/ML world, with ChatGPT and training data sets, the models themselves are small in terms of data, but when you train a data set you have to take multiple checkpoints, and the output of each becomes about the same size or a little less, but you keep multiple checkpoints, and that exponentially grows the data set you end up with. That is the biggest shift in the data landscape. And we are benefiting from that because we store all of that data, raw or otherwise.

And the other aspect of it is we do performance very well at scale. A lot of people... You're all about scale, right? I mean... We're totally about scale. I mean, object storage is about scale. MinIO is about hyperscale, because we did these tricks in the CPUs. Scaling the people scaling. Exactly. So when people in the AI world, the AI/ML world, want to upload these large data sets and have multiple checkpoints of that, multiple iterations of that, they want to upload it the fastest. And we are the best there. One of the competitors... You say the world's fastest, by the way. We are the world's fastest, that's for sure. And we have benchmarks and we are open about them. We published them on our blogs. They're there for people to see. Love it. Look at it. I was just going to ask about that. We challenge our competitors to use the same hardware and publish their results. We're all about that. Because we are very confident about the low level things we've done, down at the assembly code level, to... Be the best, all the time. To make the best product. We did it from day one. For example, the SIMD instruction set in Intel chipsets is called AVX-512, and that flag allows us to use these special registers to be fast. So we are quite confident in that. And with the data landscape changing, with ChatGPT and AI/ML data training, all of those data models and training data sets, we are in the right place. And we are getting the benefits of that.
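The SIMD point above, special registers like AVX-512 selected per chipset, comes down to detecting what the running CPU supports and dispatching to a matching code path. Below is a minimal sketch of that pattern in Go using the golang.org/x/sys/cpu package; the function name and the path labels are hypothetical for illustration, not MinIO's actual dispatch code.

```go
package main

import (
	"fmt"
	"runtime"

	"golang.org/x/sys/cpu"
)

// pickErasurePath chooses which (hypothetical) erasure-coding routine to use
// based on the SIMD features the running CPU reports.
func pickErasurePath() string {
	switch {
	case runtime.GOARCH == "amd64" && cpu.X86.HasAVX512F:
		return "avx512" // widest registers, fastest parity math
	case runtime.GOARCH == "amd64" && cpu.X86.HasAVX2:
		return "avx2"
	case runtime.GOARCH == "arm64" && cpu.ARM64.HasASIMD:
		return "neon"
	default:
		return "generic" // portable fallback, no special registers
	}
}

func main() {
	fmt.Println("selected erasure-coding path:", pickErasurePath())
}
```

The fast paths themselves would typically be hand-written assembly or compiler intrinsics; the feature check is what keeps the same binary running on commodity AMD, Intel, and ARM hardware.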
We are working with one of the competitors of ChatGPT, who is testing on MinIO, trying to get a big data set onto the server side, to the MinIO storage, to the S3 storage, as fast as possible, so that they can do multiples of them and get the training done as fast as possible. So the game is to get those data sets, the checkpoints, and the data pipeline up to the storage system as fast as possible. And we play a great role there. And we are the best in that market. I mean, the speed is so important on both sides. You see it with people using ChatGPT: they get frustrated if there's lag time. Yeah, that's the front end. We're talking about seconds. Yeah, I mean, that's on the front end. That's the front end. But everything we're putting in on this end is getting stored. So, you know, it's this two-way street, all at the same time. It's pretty interesting.

When ChatGPT came out, and I love that you're working with one of the competitors, did you feel excited? Did you feel validated? What did you guys think as a team? Well, we felt so good about the fact that data models and training that data are becoming mainstream now, that everybody has to do it. And that creates a huge market for us. I mean, that's the part that got us so excited and made us feel validated, because we were always about storing large amounts of data. But it's easy to take archival data and store it, because that's low performance. Yeah. That's not what we do best. What we do best is large scale, high performance data. Right. And when ChatGPT came, we saw, okay, that's only text, language training, and there's the voice component to that, there's the image component to that. There are going to be more people building more bots and doing more AI/ML training of data for different use cases. And that's going to explode the amount of data sets that get stored. It already is. It is already there. But that's what the validation is. That was the front end validation: seeing it become a middle-of-the-road, common use case. Everybody loves it. Everybody uses it. And there are going to be more of them. And we are on the back end of that. So the future is bright for you. Very bright. Fantastic.

Ugur, thank you so much for being here. MinIO, check them out. This has been fantastic. And thank all of you for tuning in. We are here in Amsterdam at KubeCon Europe. My name is Savannah Peterson. I'm here with theCUBE, the leading source for high tech coverage.