Good afternoon, and welcome back to KubeCon CloudNativeCon here in the wonderful city of Chicago. My name is Savannah Peterson, joined by my co-host, John Furrier. We are ripping through the morning. It's been really exciting, and I'm pumped for our next conversation. Before we get there, John, what's your highlight of the morning so far?

So far the interviews have been amazing. We've had Red Hat's top people come on. I love the edge conversation. Sally from Red Hat is so smart. What a firecracker, right? I love the operating system angle, and the combination of data, operating system concepts, platform engineering, and how to use data, with the AI wave over the top, makes this the perfect storm of innovation. Of all the KubeCons I've been to, Savannah, this is probably the one that's kind of cleared the runway and is at cruising altitude, getting up into the clouds, if you will. So I think-

It's going to get real punny today.

I think it's really a good maturation of the ecosystem. And even with the economic climate the way it is, the AI wave and the booths here are showing signs of just nonstop growth. I mean, I think it's going to be a great day.

I couldn't agree more, and I actually want to start by talking about booths. First of all, welcome, Daniel. Thank you so much for being here from MinIO, our friends. You guys are always on the show, actually. I feel like anytime we're in the same space we always invite you, because you're all so interesting. You and I were chatting about your booth and what it's like hanging out. We've noticed the buzz and the energy in the room. You were saying that there's a lot of traffic. Talk to me about the energy. What are some of the conversations that you're having?

So it's really inspiring to see all these people coming to our booth and saying, hey, I've tried your stuff, right? Because a common story of how MinIO actually gets adopted into companies is that it usually starts with the developers. They start like, okay, I ran MinIO on my laptop. It worked. This is great. Maybe I can take this to production, right? So they just deploy it as another application in their cloud native environment and it just works. And it slowly starts creeping into the organization, right? This is where shadow IT is kind of bringing in what they need. That's the benefit of software-defined, right? So they come and tell us these stories of success and we're very excited. And sometimes they even come with their technical questions and we're providing free customer support on the spot. So it's been great, definitely. It's a win-win for everyone. It's a win-win for everyone. Definitely, yeah, for everyone coming by our booth.
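As a quick aside for readers, that laptop-to-production path usually looks something like the sketch below: a MinIO server running locally on its default port, addressed through its standard S3-style Python SDK. Everything here, the Docker command, the endpoint, the default minioadmin credentials, and the bucket and file names, is an illustrative local setup rather than anything specific to the deployments Daniel mentions.

```python
# Minimal local sketch (assumed defaults): a MinIO server started with, e.g.,
#   docker run -p 9000:9000 quay.io/minio/minio server /data
# and reachable on localhost:9000 with the out-of-the-box minioadmin credentials.
from minio import Minio

client = Minio(
    "localhost:9000",
    access_key="minioadmin",   # default dev credentials; never reuse in production
    secret_key="minioadmin",
    secure=False,              # plain HTTP is fine for a laptop experiment
)

# Create a bucket and upload a file, the same way an application would talk to S3.
if not client.bucket_exists("dev-datalake"):
    client.make_bucket("dev-datalake")
client.fput_object("dev-datalake", "samples/events.json", "events.json")
print(client.stat_object("dev-datalake", "samples/events.json").size, "bytes stored")
```

Because the same API is exposed in production, promoting an experiment like this later is mostly a matter of changing the endpoint and credentials.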
Is there a customer use case or anything that you've learned about at the show this year that you didn't know about previously, or something that surprised you or maybe inspired you a little extra?

No, but we're seeing more of the same topics that we've been seeing over the past year, right? With the rise of AI, we're seeing all these AI data lakes. And so they're coming with all these use cases and telling us, you know, we're now running these AI clusters because we're training on the cloud, right? But we have our data in our own data center, right? These massive data lakes, because now they want all this data sitting in one spot so that AI can leverage it. When it comes to training, they want to leverage the cloud GPUs, right? So they're spinning up these GPU edge clusters and telling us all these gorgeous use cases, and we're like, okay, that's a great, creative way of using MinIO. So we're very happy to hear those cases.

You mentioned data lakes. How important are data lakes right now?

They're crazy important, because the industry has been coming from this trend of: I have a storage technology for this data set, I have another storage technology for that data set. But now it makes more sense to consolidate that into a single data lake, right? So this is where object storage being primary storage is actually shining, because companies are finding out this is cheaper and easier. We just have all the data in one place, and then we start leveraging it with our AI tools or big data analytics tools.

How's the AI wave impacting you? Since we last talked, we were at the Open Source Summit in Vancouver. We had a great chat. AI is continuing to surge in terms of the value of the data. You guys are in that business, storing unstructured data on Kubernetes. What's been the revelation since then with how AI is changing the value that you can extract from the data for customers?

So we're pleasantly surprised to see that we were in the right spot at the right time, preparing over the years, making sure every single machine learning framework supported object storage natively. So now AI is coming along, and AI needs to leverage massive amounts of data, right? They need to train at very high speed. Only object storage can deliver that, right? There's nothing traditional storage can deliver at the scale that these data sets are reaching for AI. So we were surprisingly in the right spot at the right time. And we are actually delighted that the industry is now waking up and noticing, oh, we actually need to embrace object storage, because that's the storage of the cloud.

It's a great example of being in the arena, doing the work, being there at the right time when the wave hits. Now what kind of value are you guys extracting from that data? Because this is, again, one of the best use cases we're seeing in business right now: getting value out of the data. What kind of value do you see extracted from this?

So that's very interesting because, at least from the customers that we're seeing, for most of them, like 80% of the data is machine-generated, unstructured data, right? And companies keep this data around because right now they don't know what to do with it. But eventually they're like, oh, I want to go back and figure out this metric or figure out this insight from this data. And now it's even possible, if they kept all the data in their data lake. And now they have LLMs to go, okay, go analyze through all these logs at very high speed. I need the storage to be able to stream me this, right? So the LLMs can do amazing things now on this unstructured data. They can read the logs and surface something that no human would have anticipated before, like: this log is sneakily trying to do something you were not anticipating, right? So it's very important to keep all this data in one single place. So now these technologies are actually making the data shine.
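For readers, here is a hedged illustration of what that consolidation looks like in practice: because the data lake speaks the S3 API, ordinary analytics tooling can read the machine-generated data in place. The endpoint, credentials, bucket layout, and the `level` column below are all placeholders, not details from any customer Daniel describes.

```python
# Sketch: analytics or training code reading straight from an S3-compatible data lake.
# Requires pandas and s3fs; the endpoint, credentials, and paths are placeholders.
import pandas as pd

storage_options = {
    "key": "minioadmin",
    "secret": "minioadmin",
    "client_kwargs": {"endpoint_url": "http://minio.example.internal:9000"},
}

# Machine-generated logs sit as Parquet files in one bucket of the data lake ...
logs = pd.read_parquet("s3://datalake/logs/2023/", storage_options=storage_options)

# ... and any tool that speaks S3 can query them in place, with no second copy
# sitting in a separate storage system.
errors = logs[logs["level"] == "ERROR"]
print(len(errors), "error records out of", len(logs))
```

The same pattern is what lets an LLM-driven log-analysis job pull years of history from a single place rather than from several siloed systems.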
So one of the things we're preparing for, Savannah, is SuperCloud 5 in our studio coming up, and that brings up the whole multiple-environment conversation. You've got re:Invent coming up, you've got Microsoft Ignite, OpenAI just had their demo day on Monday, which is very inspiring. Supercomputing next week.

Supercomputing next week.

So chip innovation, model innovation with OpenAI, it's so inspirational. It felt like a Mac-like event. People clapping and cheering. Not your typical boring keynote; it was really strong. Microsoft's going to have to get a tailwind with their products. AWS re:Invent is going to probably launch a boatload of new things. The cloud guys are coming in. The open source community is growing like a weed. And it opens up the question: how do you see this market? What's your take on this? Because you're in the middle of all the action. How does it all come together? What's the AI impact?

So this is where the trend that we were seeing, people going into the hybrid cloud model, right? Because Kubernetes is enabling this insane portability: now I don't need to struggle to deploy the same application on-premise, or on this cloud provider and that other cloud provider. So now, as I was mentioning, people are like, okay, I have this data in this location, but I need compute, right? Because right now compute is a scarce resource. I need those GPU accelerators, these tensor processing units. So now I need to be able to quickly spin up a cluster on this different cloud provider. But how can I do it? Should I rewrite my code and lock in with a vendor? Or can I just use the same API, right? And this is where MinIO being so portable, entirely software-defined, is actually making us really embrace this trend. When people need to go and embrace a big player, because let's say this one cloud has all the GPUs right now, I'm going to go and crunch my models on that cloud, right? But I want the same API. So when I tear down this compute cluster and I go to the next cloud provider, right? I can spin it up again and keep it in a consistent fashion. I don't need to update my algorithms or anything.

You've been with the team for over four years. We're now potentially at the peak of the hype curve for AI. When you started at MinIO, did you think that you would be a part of the solution for AI at that time?

That's a very interesting way to put it, because even before I joined MinIO, at my previous startup I was head of machine learning, and I was doing all the machine learning locally on top of MinIO, right?

Oh, so that makes sense now, a little bit of the origin story. We all love that, yeah.

It was the right way to build all the Spark pipelines that we were building, all the TensorFlow algorithms that we were training. It was all on top of object storage every time, because I knew there was no way we could just keep shuffling around the data sets we were using. Even when you're developing as a machine learning engineer, pulling the data set just to test whether an algorithm is going to work is really slow. So it's better if you can actually use it locally, right? So I was already a big MinIO fan, so joining the MinIO team just made it even better.

What a genuine endorsement, I love that. That was great.

I told you, Savannah, he's a high-bandwidth conversation here.

You did warn me he was smart.
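To make Daniel's portability point from a moment ago concrete, here is a rough sketch of what "same API on every cloud" tends to look like: the training job keeps one S3-style client, and only the endpoint and credentials change when the GPU cluster moves. The environment variable names, bucket names, and object keys are illustrative assumptions.

```python
# Sketch: identical data-access code, with only the S3-compatible endpoint swapped
# when compute moves between an on-prem cluster and whichever cloud has GPUs free.
# Environment variable names, buckets, and object keys are illustrative assumptions.
import os
import boto3

def object_store():
    # Point at whichever object store sits next to today's compute:
    # an on-prem MinIO deployment, or a cloud provider's S3-compatible endpoint.
    return boto3.client(
        "s3",
        endpoint_url=os.environ["OBJECT_STORE_ENDPOINT"],
        aws_access_key_id=os.environ["OBJECT_STORE_KEY"],
        aws_secret_access_key=os.environ["OBJECT_STORE_SECRET"],
    )

s3 = object_store()
# The training job's reads and writes never change, regardless of provider.
s3.download_file("training-data", "datasets/images.tar", "/scratch/images.tar")
s3.upload_file("/scratch/model.pt", "models", "run-042/model.pt")
```

Tearing the cluster down and standing it up on the next provider then mostly means re-pointing those three environment variables.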
So I've got to ask you: what's your position vis-a-vis neural networks versus probabilistic LLMs, large language models with hallucinations as a feature, not a bug, and then the power law that we published around specialized models coming in? So you've got the big, fat models, then the long tail comes in with specialized models, but neural networks are a different architecture combination than, say, the other approach. So how do you see that coming together?

So it's very interesting, because LLMs came to change the landscape one more time, right? And you see these two strategies emerging in the LLM market space. You see Meta being like, okay, I'm going to go capture all the developers by bringing all these LLMs so that experienced engineers can go and build amazing apps on top of them. And these are the kind of engineers that are very familiar with neural networks and they can connect them, right? So the language model will solve some of the problems that I had, and I'll build my domain-specific solution on top of a neural network and interface them, because LLMs have a unique advantage in that sense. And then on the other side, you see OpenAI being like, okay, I'll grab everyone else that's non-technical, right? I'll let them build GPTs now with natural language, and these are all people who want to get into the AI party but don't have the technical know-how.

No code, no code, yeah. They democratized AI.

Yes.

All right, so then the next question is, okay, what happens next in the infrastructure? Because again, we're in Kubernetes land here, so this is all moving to a distributed environment. We talk about the operating system. It'll go there, it'll be all self-driving, it'll all be self-buildable with AI. But AI itself, in a way, needs its own operating system. Like, if you look at what neural networks are, and LLMs, with the data and storage relationship that you're talking about, you almost arrive at this idea of: isn't there maybe a new AI operating system that's completely different?

So I see object storage as the operating system of AI, because pretty much the data sets are in object storage. The models end up being in object storage, right? And when you start serving them, you may actually be spinning up hundreds of servers for inference, and all these servers are pulling the model from object storage, right? So it's the one part that actually makes it simple, because today you're building your models on TensorFlow, tomorrow you may start building them on PyTorch, and you don't know what framework might come after that, right? But what remains constant is your data and your models, right? So I see object storage, being portable across all different kinds of cloud infrastructure, as that one operating system.
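A hedged sketch of the serving side of that picture: every inference worker fetches its model artifact from the object store at startup, so the store stays the single source of truth no matter which framework produced the model. The bucket, object key, endpoint, and the joblib artifact format are hypothetical choices for illustration, not a description of any specific deployment.

```python
# Sketch: an inference worker pulling its model artifact from object storage at startup.
# Bucket, key, endpoint, and the joblib artifact format are illustrative assumptions.
import os
import boto3
import joblib

s3 = boto3.client(
    "s3",
    endpoint_url=os.environ.get("OBJECT_STORE_ENDPOINT", "http://minio.example.internal:9000"),
    aws_access_key_id=os.environ["OBJECT_STORE_KEY"],
    aws_secret_access_key=os.environ["OBJECT_STORE_SECRET"],
)

# Hundreds of identical workers can boot and fetch the same artifact; the object
# store decides "which model is in production", not any individual server.
s3.download_file("models", "recommender/v42/model.joblib", "/tmp/model.joblib")
model = joblib.load("/tmp/model.joblib")

def predict(features):
    # The serving framework may change over time; the artifact's home does not.
    return model.predict([features])[0]
```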
Okay, so now multi-cloud moves us into a whole other conversation. What's the use case for multi-cloud? You've got latency, you're going to have a data layer in there, you have to have compute, GPUs, TPUs. How do you see that? Because that's what people are working on right now, okay? If I can go to Azure and get some of their GPUs, or Google and their TPUs, how do I write a programmatic, scalable system?

That's one of the use cases that amazed me listening in the booth: this company is spinning up all these edge clusters and caching all the data, the data sets, on MinIO, because now they're going to do this large training run on Azure, right? And I was like, that's brilliant. It means that every time they need a file from the data set to train, they just load it once into their edge cluster and they just train like crazy. And at the end of the training, they just destroy everything. And then it's like nothing ever happened; they have the model and they bring it back home, and they start serving inference on top of that. So these are very creative use cases, right?

Yeah, that's cool.

Yeah, yeah. We saw that starting with the self-driving car, right? They were running MinIO in the trunk, right? The car was recording all the sensor data, bringing it back into the garage, then copying that into their local edge cluster, and then the edge cluster bringing it into the cloud. So we've seen this trend coming for years, where people really need versatility from software-defined storage, right? And software-defined storage is pretty much object storage.

So the intelligent edge goes to a whole other level.

Yes.

So obviously MinIO is providing a lot of solutions for customers on the floor here and people around the world. What's next for y'all?

So right now, because we're a company that releases every single week, we are always working on top of-

Every single week is strong, by the way. I just want to give that a moment. Releasing every week is bold. I love that.

Thank you very much.

Even as fast as your customers want to move, that's the spirit.

And that's the thing: every time we fix a problem for one customer, it benefits everyone, right? And that's how we like it. So right now, people are asking us, oh, what are you guys announcing at KubeCon? Well, you know, since we release every week, every time there's a new feature, it just goes out, and then the next-

You're making an announcement every week. You don't need to wait for events like this.

Exactly. We did have a big announcement in Barcelona at VMworld, where we showed how the new data services now include MinIO. So now you can actually launch MinIO from vCenter itself and stand up a very large cluster for big data. But regarding what's next for us, we're bringing a set of new products to actually simplify the operation of object storage on the edge and in other deployment models, which are very exciting. And when we're ready to show it, we'll start showing it.

Cool, we can't wait to see it. Daniel, you're like a masterclass here on theCUBE. Really appreciate you.

Question for people watching: how should they think about preparing their data sets? They've got models, they've got their data. That's clear. It's a competitive advantage to have good data; clean data makes AI better, obviously. What advice, what best practices, what have you observed as a way for people to start rethinking: how do I organize myself, whether it's my data, or how do I look into the models? What are some areas you could recommend to start thinking about?

Well, in my experience, what I've seen is that not a lot of companies go the extra mile into pre-processing their data properly. They waste a lot; they spin up the GPU instances and then they spend time pre-processing the data just to fit it into the GPU. Whereas they could have a stage just to pre-process the data properly and put it in object storage, and then speed up training, because object storage can let you read the data as fast as the GPU can pull it. So you definitely want to be in a position where, when you're spinning up this expensive hardware, you're ready to go, right? That's probably the biggest advice I'd give people running machine learning pipelines: structure them properly so that when you're about to start a particular stage, whether it's training or just serving, you have everything ready.
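As a rough sketch of that advice, the pre-processing can run as its own cheap, CPU-only stage that parks training-ready artifacts in object storage before any accelerator is provisioned. The endpoint, bucket, paths, and the tokenize() helper below are hypothetical placeholders, not a description of MinIO's or any customer's pipeline.

```python
# Sketch: a cheap CPU-only pre-processing stage that stages GPU-ready data in object
# storage, so expensive accelerators never sit idle waiting on data preparation.
# Endpoint, credentials, bucket, paths, and tokenize() are hypothetical placeholders.
import glob
import boto3

s3 = boto3.client(
    "s3",
    endpoint_url="http://minio.example.internal:9000",
    aws_access_key_id="minioadmin",
    aws_secret_access_key="minioadmin",
)

def tokenize(path: str) -> str:
    """Hypothetical pre-processing step: clean/encode one raw file, return the output path."""
    out = path + ".prepared"
    with open(path) as src, open(out, "w") as dst:
        dst.write(src.read().lower())   # stand-in for real cleaning / encoding work
    return out

# Stage 1 (CPU, cheap): pre-process everything and upload the training-ready shards.
for raw in glob.glob("raw/*.txt"):
    prepared = tokenize(raw)
    s3.upload_file(prepared, "training-data", "prepared/" + prepared.split("/")[-1])

# Stage 2 (GPU, expensive): only now spin up accelerators; they stream the prepared
# shards straight from the bucket and begin training immediately.
```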
Talk about vector databases. Why is that so important? Why is it all the rage? And now you're starting to see everyone come out with their own vector database, and we've said on theCUBE it's a feature, not a company, but people are putting vector databases and embeddings next to data stores. What's your view on that? Why is it important? And how do you guys look at this new vectorization, these new embeddings?

So, vector databases exploded in popularity, especially with large language models, because you have these very large amounts of data, structured and unstructured, and you need to figure out a way to actually retrieve it without knowing exactly what you're looking for, right? This is where vectors come into place and encode the data. But these are very large vectors, right? So, very large vectors, and you pretty much need a place to actually back these indices, right? So you'll see this trend where most vector databases out there are built with MinIO in mind, or backing up to object storage. And we see this trend because now companies are saying, okay, let me load these 10 years of logs, right? And then I need to find that needle in the haystack with an LLM. An LLM makes it trivial to find that needle in the haystack. But you need to have all those logs in a vectorized format, because you're not looking for words, you're looking for concepts, and a vector database, a vector representation of the information, makes that trivial.

Awesome. Can we have him explain all of the complex ideas here on theCUBE? What is Kubernetes?

No, you didn't.

Hey, that's how I ended up here. Doesn't matter anymore. That's how I ended up here, so it's all good.

No, that was great. Daniel, wow, you are just full of insight and such a fantastic guest. Thank you so much for being here.

Thank you for having me.

We're looking forward to your happy hour tonight. Thank you for always bringing the fun and the education to theCUBE. John, thanks for the great questions. What an insightful session this just was. Fantastic. So I just have one question, and maybe the audience will appreciate that I asked this.

Yes.

Aren't those earrings heavy?

You know, they're not too heavy. Here, I'll let you feel. They are actual small Rubik's Cubes, so you can see.

Okay, oh, they're actually pretty light. Yeah, yeah, show the audience. You can give it a little twirl, but yeah, they're light. It works. They're just mini cubes. You can solve it right there.

That one hasn't been solved. Yeah, the other one hasn't been solved. I'll make sure not to...

I don't mind, I certainly don't mind. Well, I love it. Daniel will bring his Rubik's Cube skills to our next session, maybe at SuperCloud 5, which we just named "the battle for AI supremacy." Very excited for that coming up, and always excited for continuing conversations with MinIO. My name is Savannah Peterson. We're here in Chicago at KubeCon CloudNativeCon. Thank you for tuning in to theCUBE, the leading source for emerging tech news.