Hello everyone, I'm John Furrier with theCUBE. We're here in Palo Alto, California for exclusive news around generative AI. Of course, we've been covering it like a blanket for many months now, and there's been a surge of activity. We're here with two CUBE alumni: the CEO and co-founder of OctoML, Luis Ceze, who's been on theCUBE many times before; going back before the craze started, we were chatting a lot about AI. Welcome, Luis, to theCUBE. And of course, the investor is Madrona Ventures, and Jon Turow is on the board of directors of OctoML, also an expert in AI, formerly of AWS, now a VC partner. Matt McIlwain did the original investment, giving props there. We're here to discuss the hot news in generative AI that's being released today. Luis, John, thanks for coming on theCUBE. Glad to be here. Okay, let's get into it.

Luis, we've talked many times about what your role has been in the industry, obviously with OctoML, hot startup. Generative AI has been going crazy. You have news that you're announcing today, the first self-optimizing compute service for AI, which just sounds like a great product. I want to hear more about it. A platform you guys say is going to help developers have a fully managed cloud infrastructure to make it easier to write AI apps. Basically, take all the heavy lifting and abstract away all the complexity. Let's get into it. What's the news? This is the hottest area. Generative AI, people want to do more. GPUs are on allocation; you can't get any. There are all these different services out there. A lot of confusion, a lot of noise, but it's hot. What's the news?

All right, so we are releasing today the OctoAI compute service. It's a self-optimizing compute service for generative AI. It offers freedom, because it lets users choose their model or bring their own custom models. Second, it offers efficiency, because we optimize the models, we choose the right hardware, and we make sure that it gets the right performance-efficiency trade-offs. And then third, it's very easy to use; we make it very easy for folks to get started. We offer a collection of super-optimized models like Stable Diffusion, Llama-based LLMs, Whisper for audio transcription, and so on. So with freedom, efficiency, and ease of use, we're offering a service that makes it very easy for folks to build really amazing generative AI applications.

There's a lot of confusion out there around how to get into the gen AI business. Some people are leaning into it and have been doing well with it, but the rapid pace from idea to code to product, we've never seen this kind of acceleration. What problem are you solving for the developers? I'm asking this because interest is high. What problem are you solving specifically?

Yeah, so several problems. First, abstracting away the complexity and helping clear this confusion, right? So we offer the ability for developers to come to the platform, select a use case, for example, text-to-image or text-to-text, and very quickly get started with state-of-the-art models ready to go and ready to be integrated into their environment. And then we also abstract away all of this incredible complexity that is involved in putting a model into production, because once you have a model to deploy, the path from there to actually deploying it in a way that, first, has the SLA you need to make your application usable, and, two, has the right cost and scalability properties, is a lot of work today. And we completely abstract that away and make it fully automatic.

John, you've been covering this area.
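To make that "select a use case and integrate a ready-to-go model" idea concrete, here is a minimal, purely illustrative sketch of what a managed text-to-image service implies for a developer: one authenticated HTTP call instead of provisioning GPUs and serving the model yourself. The endpoint URL, payload fields, and response shape below are hypothetical placeholders for illustration, not OctoML's actual API.

```python
import base64
import requests

# Hypothetical endpoint and payload shape, for illustration only.
ENDPOINT = "https://api.example-managed-service.com/v1/text-to-image"
API_TOKEN = "YOUR_TOKEN_HERE"  # placeholder credential

payload = {
    "prompt": "a watercolor painting of a lighthouse at dawn",
    "steps": 30,
    "width": 512,
    "height": 512,
}

resp = requests.post(
    ENDPOINT,
    json=payload,
    headers={"Authorization": f"Bearer {API_TOKEN}"},
    timeout=120,
)
resp.raise_for_status()

# Assume the service returns the image as a base64 string; decode and save it.
image_bytes = base64.b64decode(resp.json()["image_b64"])
with open("lighthouse.png", "wb") as f:
    f.write(image_bytes)
```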
We've been on theCUBE talking about this before. Madrona, obviously an early-stage investor here in the company. The world spun right onto the front doorstep of OctoML. What's your assessment? What do you see in this new platform? What's the outlook? What should people think about it? What's your perspective?

Well, what's really exciting about the world we're in right now is that there's almost sort of an Android moment, which Luis and I have written about before. Look, we have very exciting models, you might call them the iPhone kinds of models: things like GPT-4 and ChatGPT from OpenAI, Anthropic, Cohere, and lots of other models like that that are genuinely really powerful, really exciting. There is also a world of open source AI models that are really useful in their own right and together with the closed models. And the open source models are special because they let developers and scientists push the boundaries on functionality, on data types, on cost, and on latency. And you can assemble these models into what are called ensembles; some people call them cocktails of lots of models that work together. And while that is good, and we've seen great companies like Runway ML and Midjourney and some others built on top of open source AI, it's been really difficult until now to get started with that, and to manage it and run it all. And so what's exciting about what OctoML and Luis and the team are doing is that they're going to be able to give, for the first time, the kind of ease of use with open source AI that you're getting with the closed models. That's going to unlock lots of new innovation.

Let's get into the open source content here, Luis, because, you know, John, we were talking before this interview over text and also in person about the long tail. There's a lot of long-tail stuff coming out. You've got the power law; I've never seen the power law applied to this kind of phenomenon before, but open source is making a real impact. More people are coming in, a lot of experimentation. People are looking at different scopes of model size; they don't need to be huge models, they can work with existing ones. There's a blending of cocktails, if you will. Why is this important? What's the big thing that's going on, Luis? Why is this so successful?

Well, I mean, the pace of progress in these open source models is just absurd, and they're showing themselves to have value, right? I'm talking about, literally on a weekly basis, new models being released, and these models are actually useful for problems that businesses have today, right? So now the big problem here, and the anxiety that this generates, is how do you ride all of that? How do you actually, you know, harvest value out of these models and into your business in a way that's sustainable? That's exactly our goal. I mean, we are extremely focused on making these open source models that are moving so fast work for you, work for the business, right? So with that, you know, the application creators focus on what experience they'll offer their users, and we abstract away all of the complexity of infrastructure and the details involved in getting these models to actually function in production, right? So that's our laser focus here: let businesses extract value from these models that are moving really fast.

John mentioned proprietary, Android, iPhone. I like the analogy, but it's interesting: OpenAI doesn't run on AWS yet.
Anthropic does, OpenAI is on Azure, and then there's Cohere out there; they interplay. And there's interaction between them. Is this part of the equation? Because I see lock-in there. I don't see a lot of choice if something runs only on AWS or only runs on another cloud. What does that mean? Because there are consequences to that. I mean, is OpenAI not going to run on any cloud? What good is that? Or is it going to be proprietary? Are we back to the old network protocol stacks where, you know, you had to pick something and stay with it? How do you guys talk about this, and how do you see it unfolding? Because it doesn't make any sense to me.

I think we're in a very early moment where a lot of these questions are open. You can use OpenAI from AWS; it's just a little harder, things like that. And that's the moment that we're in, but the clear movement from the enterprises and developers that I speak to is that they want to access this powerful technology, and they want choice of the best model, the most useful model, and the combination of those models for the use case they're building.

Luis, what's your take on the idea we've talked about on theCUBE that developers and open source are the de facto standards bodies these days? They're the ones driving the change. If you look at just what's happened in the past four months alone in open source, it's been pretty incredible. Explain the phenomenon there.

Yeah, so let me answer a mix of the two questions that are in the air here, right? I'll start with this: first of all, these closed proprietary models offered by, say, OpenAI have incredible functionality and real value. It takes a lot of money and a lot of effort to actually build those models and understand them, and gradually they're building a business out of it. And these models, models like GPT-4, do truly surprising things that are very, very powerful. But what John said is really important, which is that it's important for users to ask the question: what is it they want to solve? Do they want to build a question-answering service over a set of documents? Do they want to do a text summarization service? Do they want to do text classification? Do they want text-to-image? And once you understand your use case, it becomes really clear that there's a large set of open source models, quickly progressing, that can solve those specific use cases and address them very, very directly.

Now, the end-user advantage here is the following, right? For the use cases that you know can be fulfilled by an open source model, or by a customized or fine-tuned version of those models, you can control the deployment of that model. You have control of where it runs and how it runs, and what data it's going to touch. You have full transparency into what it does. And you can use that in the components of your application where it makes sense. That said, it's also important to realize that once you need functionality that today is only available in the large proprietary models, you can still call them. The reality today, which John alluded to, is that people are building with an ensemble of models, where you combine a collection of open source models that do the things you know they do well (we can validate that), and then, once you need functionality that's only offered by the closed models, you can call them, right?
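A minimal sketch of the "ensemble" or "cocktail" pattern described above: an open source model that you host, inspect, and validate handles the use case it is known to do well, and a closed proprietary model is called only for functionality the open model can't cover. The Hugging Face summarization pipeline is a real, commonly used API; the closed-model call is left as a hypothetical stub since providers and SDKs vary.

```python
from typing import Optional

from transformers import pipeline

# Open source component: a small, self-hosted summarization model we control,
# can inspect, and can validate on our own data.
summarizer = pipeline("summarization", model="sshleifer/distilbart-cnn-12-6")


def call_closed_model(prompt: str) -> str:
    """Hypothetical stub for a proprietary hosted LLM (name and signature are
    illustrative only); in practice this would be the provider's SDK or API."""
    raise NotImplementedError("wire up your closed-model provider here")


def handle(document: str, question: Optional[str] = None) -> str:
    # Route: summarization stays on the open model; open-ended question
    # answering falls back to the closed model.
    if question is None:
        return summarizer(document, max_length=120, min_length=30)[0]["summary_text"]
    return call_closed_model(f"{document}\n\nQuestion: {question}")
```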
So all that said, if you zoom out on the market and the timeline, it's interesting to see, and this is the point that John and I made before, supported by others in the community, that the gap in functionality between closed proprietary models and open source models is shrinking, right? And if you project this out over time, I think it's easy to see a world where, yes, you'll probably be able to fulfill all of your AI needs out of open source models. And that means the big challenge then is: do you have a platform to run them on, right?

Yeah, and that's what I like to hear. Let's get back to the news. So GPUs are on allocation, they're hard to get, they're jacking up the prices, NVIDIA stock is soaring. Although they just invested in Cohere last week, we saw that news; NVIDIA is a player, okay? Amazon has the cloud and they have a great footprint, carbon neutral, they say, by 2025, and these large language models aren't helping the world. I mean, talk about carbon footprint impact. You know, I was talking to a customer, and they said their sustainability budget and metrics are exploding off the charts because the sustainability goals aren't being met; they've been blown out by all the compute. So it's a lot of pressure. At the end of the day, the developers just want to code. What do you guys offer the developer? What's the pitch of this platform? Why use you guys? What does this mean for me? I want to get in, get my models nailed down. I want to understand how it all fits. I'm tinkering, I'm kicking the tires. How does it work?

Yeah, absolutely. This topic's obviously near and dear to my heart because of the OctoAI compute service, but also because I'm a computer architect by training, so I like seeing chips being so important in this new phase of the world here. So what we offer to the developer, again, is the ability to not have to worry about infrastructure. What does this really mean? It means really making the deployment viable, which in the first instance means having access to the silicon, right? Because of our ability to deeply optimize the model and how it runs on the actual hardware, we can actually offer a choice: you may not need an A100; you might actually use an A10G, for example, or maybe even a T4 that's readily available. Offering optimizations such that the model uses less compute, coupled with the ability to move the actual work around and abstract that away from the end user's point of view, gives you access to more silicon, fundamentally, right? And that directly leads to more cost efficiency. Cost efficiency comes from literally making the code faster and run better, that's one. And then the multiplicative factor is the ability to run this on silicon that has better pricing, right? And more availability. And then, all that said, of course, we're supporting and working with Amazon, and supporting Inferentia, and we're going to expand that offering to other providers as well, right? So abstracting away the choice of hardware, such that users can focus on building their application, which is what really matters, is a significant part of our mission.

And the use cases, you guys, what do you see coming out of the gate? Customization, is that the purpose, finding their differentiators?
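As a back-of-the-envelope illustration of the multiplicative cost argument above (faster code times cheaper silicon), here is a small sketch. Every price and throughput figure is an assumed round number for illustration only, not a measurement and not actual cloud pricing.

```python
# Illustrative only: assumed round numbers, not measured results or real prices.
# The point is that optimization (more images per second) and cheaper silicon
# (lower dollars per hour) multiply together.
scenarios = {
    #                           $/hour  images/sec
    "A100, unoptimized model": (4.00, 2.0),
    "A10G, optimized model":   (1.00, 1.5),
}

for name, (dollars_per_hour, images_per_sec) in scenarios.items():
    cost_per_1k_images = dollars_per_hour / (images_per_sec * 3600) * 1000
    print(f"{name}: ${cost_per_1k_images:.2f} per 1,000 images")
```

Under these assumed numbers, the optimized model on the cheaper GPU comes out roughly three times cheaper per thousand images, even though its raw throughput is lower.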
Yeah, I mean, one clear use case that we see extreme need for, and it's very much featured in our launch, is text-to-image. You know, it's just interesting how text-to-image is a great example of an open source model, like Stable Diffusion, leapfrogging what was out there that was closed before and showing that it can offer amazing results. So text-to-image is, you know, a very significant workload in the world right now. We support that: users can come and build with Stable Diffusion 2.1. We also guarantee you have the latest and greatest optimizations, the ability to fine-tune that model, and the ability to change the model weights in place, such that you can customize it for a broad set of users and quickly switch between them. So that's one clear use case, text-to-image. Another use case that we're supporting today, ready to go from pre-optimized models, is large language models for, say, text-to-text, right? To do summarization, text expansion, text classification, and so on. The third one is audio transcription, turning, you know, audio into text; we support that as well. But we also support customers bringing their own models and building their own custom containers to go and run on the service, so we have an authoring service for that. Anyway, those are some of the key use cases.

I also want to point out that we've been steadily releasing some exciting demos built on our compute service. One of them is the InkMM multimodal model; it's the first open-source, commercially usable multimodal model, where you can upload images and ask questions about them. It's called InkMM, and we show how that was built. There are other demos too: you can make your photos look like they've been cartoonized, and we show what the code looks like there. So these are some examples of use cases that users are really excited about, and examples of what people can build with very, very quickly.

John, please complement that. Weigh in here, because you're watching open source. There's innovation happening. What are you seeing in the landscape that supports this need to get tinkering, get building, get experimenting, mixing those cocktails? I mean, you're seeing performance enhancements, all kinds of things. What are you seeing?

Yeah, it's about pushing the envelope on one of a few dimensions: customization to push the envelope on new features, new types of data, lower cost, higher performance, or privacy and, you know, sensitivity of data. And when you speak to developers or CEOs or CIOs, what you're going to hear is that some combination of those items becomes very, very important. And this is not to diminish the amazing work that OpenAI and Cohere, Anthropic and others are doing, but there are cases where a CEO wants to take her fate into her own hands, and open source makes that possible.

That's awesome. I think, too, about the support that's going to come out of open source. You know, everyone said we'd never see another Red Hat. Maybe there could be, because you've got to get in there; once there's momentum, you're going to want support. Enterprises are going to want support. That's another big part of it, Luis, isn't that true?

Absolutely, yeah. We have a world-class customer success team here, and they're ready to help.
And, you know, some of our early access users already, you know, enjoy that benefit, and now we're opening it to the world. It's very easy for folks to get started, and we have a world-class team of ML engineers and solutions engineers to help folks build with it.

Well, congratulations on the big news: a platform, infrastructure, managed service. Awesome, getting people standing up their models, testing them, customizing them, tuning them up, and helping them do that. Well, in the last 30 seconds we have left, Luis, give the pitch for the developers watching and for the press watching this video. What's the big deal? Why OctoML? Why this platform? Give the pitch.

Yeah, so we're releasing the OctoAI compute service, you know, a place where you can come in and build your AI applications with the freedom of choosing your model, the model that you want for the application that you want; highly efficient, making them run fast at low cost and giving you control; and finally, ease of use. We make it very easy to broaden the set of creators who go build with this, you know, with a set of world-class models optimized and ready to be used, and very, very clear instructions and quick-start guides here. So that means more people building with it, more innovation, and a better life here.

Great service, helping the developers, getting more innovation, getting that flywheel going: faster, smaller, cheaper. That's the way we like it in open source, Luis. Thank you for coming on, John. Congratulations on a great venture. Say hello to Matt for me, and congratulations on the news, Luis. We look forward to following up.

Thank you, John.

All right, everyone: the OctoAI compute service is out. We'll be checking it out. This is theCUBE, exclusive news from OctoML. I'm John Furrier, host of theCUBE. Thanks for watching.