Welcome back, everyone, to SuperCloud 4, our fourth episode. The topic this episode is generative AI and its impact. Obviously, SuperCloud is next-gen cloud: multiple environments, large scale. The role of data has been the topic. Ben Kus, the CTO of Box, is here with us. He could not make it in studio for our live on-stage forums, so he's joining us remote. Ben, thanks for coming on; we appreciate your time.

Happy to be here, thanks for having me.

So the role of generative AI is all the rage. Obviously, machine learning and MLOps have been around for a while; we've seen a lot of supervised machine learning over the decade. The role of data: Box is no stranger to that in the enterprise. Enterprise data can be kind of hairy, you've got enterprise search, so how do you wrangle it all? A lot of tech there. Now with generative AI, a gift has fallen into the industry's lap. You're in the middle of it; you're doing a lot of work there at Box. How are you looking at this? How is it impacting your company and your customers?

You said it was a gift, and I completely agree. The power of the new generative AI models is just wonderful, even compared to the previous models, which were also very impressive after the last 10 years of advancements in machine learning and neural networks. A lot of things became possible that weren't really possible before. But with Box and with many of our enterprise customers, one of the big concerns, of course, is unstructured data. It was always so hard to do ML on unstructured data. And since something like 90% of the data in any organization is unstructured (audio files, video, doc files, PowerPoints, PDFs, all this really valuable data these companies have), it was always so hard for them to process it. If you have your data structured, you can do a lot of wonderful things with it. But if it's unstructured, you kind of have to read it.
Then when the new generative AI came out with these wonderful large language models, suddenly you could have the AI not only understand things but also start to help you create content. This has been really eye-opening for everybody. In particular, I think many large enterprises are looking at it and saying, wow, what's possible now is different than what was possible a year ago in terms of production-class AI. In many ways, a lot of companies are still thinking about what they can do with this. There are a lot of interesting use cases many people are adopting, but almost every day we talk to customers who are still asking, what more can I do? And they're seeing that the rapid evolution of this AI is continuing.

You know, one of the things I've really admired about Box over the years, since the early days of the founding with Aaron and the team, is that they made things simpler in an era where the enterprise playbook was to solve complexity by making it more complex. Just the classic enterprise playbook, right? If you're a vendor, lock them in. You guys also took some territory in a competitive enterprise market where there's a lot of muck and toil and undifferentiated heavy lifting. Databases now face the advent of unstructured data coming in. With IoT and everything connected, there's a tsunami of devices, and analytics is top of mind. But budgets aren't increasing as fast as the data is. So the problem is, how do I create value? And this is a technical problem, not just an apps problem. I've got to write apps for the business, but I've got to run IT and platform engineering at the same time. How are you looking at this as the CTO? Because you have to wear both sides of that hat: you have to look internally and make innovation happen to enable that app value on the front end for the customer.
Yeah, our customers will tell us straight up: I need you to prove that this helps my business concerns. Of course, the technology is totally awesome, and of course there are many different aspects everyone wants to talk about. But at the end of the day, when it comes time for them to pay more money, they want to make sure it's giving them value. Things like making sure it drives productivity, making sure it helps create things that maybe weren't possible before: those are the number one topics in the business-value conversation our customers are having. So for us, being able to take the AI and apply it to their data, their content, begins the conversation of, okay, if we start to do that and we start to do more of it, how can I make people more efficient inside my organization? How can I achieve things faster? How can I get to market faster? How can I do all these tasks that everybody does every day, but faster or better? I think that's where most of our customers' minds are right now.

I want to get your thoughts on, and reaction to, a statement I'll make. And I want you to look at it from a pro-AI perspective. You've got a couple of things emerging. First, the AI wrapper apps that just wrap around some of the large language models, feed them some data, very well engineered on the prompting side. I'm oversimplifying; I'd call it like building a website on top of the web. Okay, you can do it, it's a great app, people use it, great. I'm a fan of that, by the way; I was kind of poo-pooing it before, but I think AI wrappers are not a bad thing at all. Second, native AI apps. And third, AI systems. So, three categories of things emerging. What's your reaction to that? Do you agree? Would you elaborate?
Do you see the same thing? What's emerging? What's the spectrum? You have the system, the native, and then the wrapper.

Yeah. So for the idea of the wrapper, which is to say the AI has certain capabilities, so why not just expose it directly with minimal other value propositions around it: in many ways, I don't think there's anything wrong with that. There are many companies who exist because their job is to make things better. It matters a little less whether it's revolutionary, crazy technology; they just focus on solving problems really well. From our view, one of the things that is critically important is that you can't just solve the AI problem. Maybe a long time ago AI was a competitive advantage, but even the fact that we're talking about a wrapper shows that everybody has access to the same fundamental technology, whether online via OpenAI or via Google or via any of the wonderful vendors out there. So it's not that hard for anybody, big companies, medium-sized companies, small companies, to just start using it. So what is the advantage you have on top of that? Of course, the way we look at this is to say the value of content overall is to make sure you can apply AI to it safely and securely. One of the big concerns is that you have to be careful with security permissions. If you're not careful, the AI itself becomes a source of data leakage, which, especially when you talk about bigger enterprises, becomes a big challenge. And so I think this is where you start to see that for companies that start with a really cool wrapper proposition, they have to either build out the rest of their product offering, or, of course, other companies are always looking to add AI to what they do.
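The data-leakage concern above comes down to enforcing the caller's permissions before any content reaches the model. A minimal sketch of that pattern, with everything hypothetical (the `Doc` class, ACL sets, and keyword scoring are illustrative stand-ins, not a Box API):

```python
# Hypothetical sketch: filter by per-user permissions *before* the LLM sees anything,
# so the model can never surface a document the caller couldn't open directly.
from dataclasses import dataclass, field

@dataclass
class Doc:
    doc_id: str
    text: str
    allowed_users: set = field(default_factory=set)  # simple ACL stand-in

def retrieve_for_user(user: str, query: str, corpus: list) -> list:
    """Return candidate docs for the query, dropping anything the user cannot read."""
    visible = [d for d in corpus if user in d.allowed_users]
    # Naive relevance: keyword overlap; a real system would use vector search.
    q_terms = set(query.lower().split())
    scored = [(len(q_terms & set(d.text.lower().split())), d) for d in visible]
    return [d for score, d in sorted(scored, key=lambda s: -s[0]) if score > 0]

def build_prompt(user: str, query: str, corpus: list) -> str:
    """Assemble the context the LLM is allowed to see for this user."""
    context = "\n".join(d.text for d in retrieve_for_user(user, query, corpus))
    return f"Answer using only this context:\n{context}\n\nQuestion: {query}"
```

The key design choice is that permission filtering happens at retrieval time, not by asking the model to withhold information it has already been given.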
So on the system side, do you see any impact: a new neural network, LLM, data APIs? Is there a paradigm shift emerging on the far end of the spectrum? Because we're seeing the conversation range from low-hanging fruit to a lot of experimentation, with some production, mostly through tooling and picks-and-shovels helping people get stuff into production. But there haven't been a lot of conversations about large scale, in production, a full big workload. We're not there yet. Is it going to be there soon? Do you see the progress happening faster? Is there good stuff in production for workloads, or is it still progressing?

I think it's all of the above, right? You start to see a lot of companies that maybe a year ago were talking about ideas, then started to say, I'm going to have betas. And nowadays we're starting to see many companies releasing their systems in production. In many cases, though, they're starting with the most obvious base use cases that people will then need to adopt and build upon. One of the challenges for enterprises is that some of them are very scared of AI, for some of these reasons: you have to really know the regulatory landscape, you have to know the security landscape, you have to really think this through. So even the first use case can sometimes be very hard for them to adopt. Then over time, as they do that, the next wave of features gets more powerful, and the next wave of technology, some of the new stuff coming out of our partner vendors, means it's not just about text anymore. It's about multimodal images. We're starting to see some interesting ideas in how people deal with videos and audio.
Once enterprises start to digest and adopt some of the things that are available now in production and start to scale them up, it'll be easier for them to adopt the next level of AI technology that's coming out. And I think this is why many companies today are looking at what platform they can adopt that would help them deal with their content, or their emails, or their editors, and so on.

Yeah, definitely. I think it's not even the first inning; I think it's pregame right now, but there's clearly some low-hanging fruit, as you mentioned. I still like this idea; I think there's going to be a system emerging. There are still some cost concerns. I have to ask you: on the continuum of AI, do you see it ranging from mostly cost savings to generating new revenue? Where do you believe this falls right now? I can see both sides of the coin, but is it mostly cost savings initially, or is it moving quickly to generating new revenue?

I think there's a bit of all of it. At some point, if your employees have things like AI assistants, if they're able to understand their data better, if in some cases you can actively engage AI in brainstorms, if you can actively involve it in product development: these kinds of things, I think, help almost any company, across industries. And you could say, well, it could be cost savings, because if you're more productive, then you'll be able to save something. But at the same time, what we're seeing with a lot of our customers is that they're not saying, I want to save. They're saying, I want to do more. I want to turn that new value into something I can externally consider to be business value.
I think one of the good examples is GitHub Copilot, right? The idea of using AI in your engineering. I have not heard of a single person who says, oh, now that we're 10%, 20%, 40% more productive because of AI, great, we'll get rid of that staff. That's dead. They're all limited by the amount of resources they have to hire engineers. So they're just going to be doing more. They're going to be releasing more. They're going to be making better products. And I think we're going to see that in a lot of places.

Yeah, I think that trickles down for sure. I definitely think no one's turning that around. My next question: how do you see generative AI influencing how organizations and your customers, whether direct customers or partners in the ecosystem, build software? And does it change the type of software they can build in the future?

I think it does. With many of the customers we talk to, there's this traditional aspect of the way you sell software: one of the first things you want to do is understand some of the problems customers have, so you can focus on how you're a solution to them. But with AI, one of the first things everybody's digesting is just what is possible. In fact, I had a customer who said, okay, before we start this call, don't ask me what I'm going to do with AI. I need to understand more about AI so that I can figure out what to do internally. So this is more the idea of AI becoming a platform by which people then think of new capabilities to deliver inside their organization, both internally, just running their org, and for creating new external-facing products and so on.

It's not a product, it's a platform. Totally got it; that's a great point.
That's worth calling out. I've got to ask you on that point: data has become key, obviously; we're hearing that on the show in this episode. What's interesting is the conversation has been shifting. What was once passé, old-school taboo, is now in vogue, right? You've got walled gardens, proprietary data sets. Come on, if you go back 20 years, you couldn't say those words. No one wanted a walled garden; everyone was going to be open. But now, with data being intellectual property, if you look at the large language models, we introduced on the CUBE research team the power law to help our audience understand that there's a power law of language models. You've got the big three or four at the top. They're not going anywhere, but you can interface with them. They're large public models. They're called proprietary, but they're actually public, which is interesting, right? So you have a mindset shift going on. You have open source, you've got small language models. It could be our transcripts. So I see an ecosystem emerging that's more like a neural-network API economy around data, where distributed data that's walled-garden but open and composable is going to be a thing. What do you think about that? It's kind of out there, but what are your thoughts?

Well, it's a good question. I was talking to one of our customers recently and we had this discussion: will the AI models of the future be more like coffee or like wine? With coffee, of course there are so many different flavors and so many different ways to have it, but a lot of people are thinking of a big venti: oh yeah, I had Starbucks this morning, or I had a Peet's coffee, or whatever else.
But if you want specialist coffee, it's available. Versus wine: if I say, name me three wines, probably everybody's going to name a different set of wines. So my guess is that we'll probably end up more in the world where there are some very powerful, very dominant models, like we've seen so far: the technology from OpenAI, the technology from Anthropic, the technology from all the big clouds; Google is a great partner of ours. Everybody is making not only their own models, which are very good, but they also have these model gardens where you can select among these other great technologies. And so we believe these models will continue to evolve; they'll continue to get very, very good. They are very general purpose and very powerful. That leaves open this world where sometimes you want a specialist model: the equivalent of a certain type of coffee, I guess, a certain brand from a certain country that you really like. Often that can lead you to a cost benefit, because the models are smaller, or a benefit for something specialized you're doing. And I think people will use those. But in many cases, especially for all the unstructured content we have, the more general-purpose models perform very well. And we continue to see them get more capable, which you've seen even in just the last few months, while their price in many cases keeps going down. So those are going to be one of the first things people think about when they're considering doing AI for their organization.

Interesting point. You mentioned the specialty models. What's interesting, and you mentioned unstructured data...
One of the problems with unstructured data, as you pointed out earlier, is that it's a sea, or a lake, of unstructured data. But as you get into these verticals, you have domain-specific linguistics or data, whether it's multimodal or whatever mode it is. The vector database trend has pointed to the fact that you can start doing these embeddings within the vertical to get better insights for retrieval, which could also accelerate some of the data transfer and data addressability. What are your thoughts on that? How important is that?

Yeah, vector embeddings are, in an interesting way, the unsung hero in many cases of the new AI revolution. It's similar technology in some cases, but there's a big difference between a really good, powerful, well-trained embedding and one that maybe isn't so good. You start to see that when you have a giant corpus of data: an enterprise might have petabytes of data, hundreds of millions of pieces of unstructured data. In order to understand it, and to get the large language models or any of these foundation models to process it, they have to be able to find roughly the kind of data you want. And it's so much cheaper, more efficient, better, and more secure to retrieve it ahead of time and then feed in a single question, as opposed to trying to train on your whole corpus, which is constantly changing, with permissions changing too. So the RAG approach, retrieval-augmented generation, is excellent, and the vector databases are getting more powerful seemingly every day. There are a lot of great vector databases out there. So I think this is going to be a big part, especially in an unstructured-data world, of the way you handle AI overall.

It's interesting. We're talking about performance and cost.
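The retrieve-then-ask flow described above can be sketched in a few lines. Everything here is hypothetical: the toy `embed` function stands in for a real embedding model, and the returned prompt stands in for an actual LLM call; it only illustrates the shape of RAG.

```python
import math
from collections import Counter

def embed(text: str) -> Counter:
    # Toy stand-in for a real embedding model: a bag-of-words term-count vector.
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def rag_prompt(question: str, corpus: list, k: int = 2) -> str:
    """Retrieve the k most similar passages, then build the prompt for an LLM."""
    q_vec = embed(question)
    ranked = sorted(corpus, key=lambda doc: cosine(q_vec, embed(doc)), reverse=True)
    context = "\n".join(ranked[:k])
    return (
        f"Using only the context below, answer the question.\n\n"
        f"Context:\n{context}\n\nQ: {question}"
    )
```

This is the "retrieve ahead of time and feed a single question" idea from the interview: only the top-k passages travel to the model, so cost scales with the question, not the corpus.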
The word memory comes up. You mentioned RAG, the retrieval aspect of it. There's a memory aspect now with these embeddings, and there are new observability metrics emerging that no one has: if I change my model, how do I know whether it's smarter or not? Does it restore its memory, as in remembering the retrieval, versus actual memory on the machine, which lowers your cost for, say, inference? We're kind of in a weird time here, don't you think? You've got cost on training and inference; maybe I can use OpenAI for training because they've already got built-in costs there, but inference now becomes a challenge. And data and memory, whether it's memory for time to first token and all kinds of throughput challenges in inference. How do you rationalize all this? For customers, it's kind of complicated. What do you think?

At some point, some of the underlying aspects of how the systems work, the compute, the memory, even the pricing schemes, get quite complex. And so for most enterprises, some of them are going down the path of building things themselves, and of course many ISVs are building things; we build a lot. But usually I think you want to associate the cost with some higher-level piece of value, rather than trying to understand every underlying economic detail, like whether to rent your own GPUs versus use an off-the-shelf model. We're big fans of the latter: we have big partners who have these wonderful models, so we want to use those whenever we can. And although in some cases that can be expensive, you should be looking at the overall cost for value. So if something was taking a human an hour, and you can do it with AI in a way that makes that person more productive, and that costs, say, a few cents or a few dollars, then I think you just compare those at that level.
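The cost-for-value comparison above is simple arithmetic. A hedged sketch with made-up numbers (none of these rates come from the interview; they are purely illustrative):

```python
def ai_net_value(human_hourly_rate: float, minutes_saved: float,
                 ai_cost_per_task: float) -> float:
    """Net value per task: labor value recovered minus what the AI call costs."""
    labor_value = human_hourly_rate * (minutes_saved / 60.0)
    return labor_value - ai_cost_per_task

# Illustrative only: a $60/hour employee saves 50 minutes; the AI call costs $0.40.
net = ai_net_value(60.0, 50.0, 0.40)  # 60 * (50/60) - 0.40 = 49.60 per task
```

The point mirrors the interview: judged per token the call looks expensive, but judged against the hour of labor it replaces, the unit cost is a rounding error.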
And it's very important to look at the total cost of ownership of these things. Sometimes people get distracted by the token costs or the inference costs of the moment, and then they try to optimize the wrong things. So we're big fans of optimizing holistically, as opposed to optimizing one little unit.

Yeah, I think that comes up a lot when people talk about end-to-end versus going into the weeds on one specific thing. That's a great point. I've got to ask you: what are you doing at Box that's compelling? What's the coolest thing you're working on right now, if you don't mind sharing without giving any non-public information away?

Oh, we just released a bunch of things at BoxWorks. Since we have so much unstructured data, and for our enterprises, again, most of the data is unstructured, we believe we need to offer useful AI capabilities infused throughout the whole product: being able to ask questions of your documents; as you're searching for things, being able to ask questions and have the system go out, retrieve the answers, and provide them to you while you're actually finding your data; being able to do things like structure your data, help you with security, and tag your data. These are things that before would have taken humans to do, and now the AI can do them, and it's just very compelling. On top of all of that, we want to make sure we are also a platform through which people can do AI on their content overall. That is something that's important to us and to our customers. At some point you have companies like us who are solving these problems around AI on unstructured data, and, as with some of the wrapper companies you mentioned, at some point there's a holistic view of how to do this, and we're solving all those problems.
We think we're pretty good at it, and so we want to offer our services wherever possible to all of our customers.

The big rhetorical question I always ask is: is there an AI operating system coming? Is there going to be a Linux moment for this industry, where people go, okay, there's a system that can run across an enterprise and abstract away all those hard problems: siloed data, enterprise search, privileges and permissions, compliance and governance, scaling data integration without clean rooms? I mean, this is the nirvana.

Yeah. In many ways, MLOps came along to help you with the ML set of problems, and that was a very powerful set of infrastructure. And I've seen some companies talk about AIOps, but really, since the AI is so powerful, you start to say, I need my AI in conjunction with a set of use cases, on your data or on the way you're pushing your data out through knowledge bases or other things. So this is where I think you'll start to see companies specializing in platforms for that, as opposed to just the underlying infrastructure, which is how the ML revolution unfolded.

I think there's going to be so much action. I think the whole platform-engineering, under-the-hood infrastructure is going to be exciting. You know, when I saw Databricks announce Parquet and Iceberg support, that was a shot across the bow of the industry. It completely democratizes SQL access across structured and unstructured data. It's incredible; that could literally put an entrepreneur on the playing field tomorrow with a competing product, a connector. This is going to open up huge innovation.

Yeah, I totally agree.

Well, Ben, thank you for taking the time to contribute to our SuperCloud 4 and for coming on remotely to our event.
We appreciate your time and your great contribution, with great highlights. A little masterclass there; thanks.

Thanks for having me.

All right, that was Ben Kus, the CTO of Box, breaking down all the action: the role of generative AI in the enterprise, and how companies that have the data, working with unstructured data, are bringing value and taking away a lot of that undifferentiated heavy lifting to create the next generation of infrastructure and applications. We'll be back with more live coverage here in Palo Alto after this short break.