So I'm John Furrier, with Dave Vellante. This is our all-day live coverage in our Palo Alto studios, breaking down the AI innovators. And the next segment is really about AI infrastructure and the future. We've got a great guest, Salman Paracha, the CEO of Katanemo, a startup in Seattle. Great to see you. And this is your time: Oracle, AWS, and now the future of infrastructure is being rewritten. It's unknown yet what it's going to look like, and you're building it. Thanks for coming on.

Hey, thanks for having me. You know, we exchanged some notes about this at the dinner at AWS re:Invent. So I'll speak a little bit about what we're doing and why we feel that, as applications get reinvented, we will be at the precipice of reinventing the future of infrastructure. As I mentioned, you guys talk about this in the context of supercloud, which I think is a super interesting way to put it. I like to describe it as intelligent infrastructure. If you look at the history of infrastructure over many, many years, distributed systems and computing emerged as the internet emerged. And now, as generative AI and large language models emerge, how do we help developers harness the full potential of this capability to build intelligent applications, and where does infrastructure have to keep up with that? So we're building this notion of intelligent infrastructure for developers, starting with a very specific service in this area to solve the perennial challenges around security, observability, and really around model choice. That's what's going on at Katanemo at the moment, as we play offense with small LLMs rather than the ultra-large LLMs.

So one of the things we put out months ago, which is being talked about a lot now, is the power law of language models, with the foundation models. If we pull up that slide, I want to get your thoughts on this, because Jensen Huang at Stanford last week talked a lot about the specialty models around inference value. He talked about the large language models at the top of the power law. Today it's pretty much a straight drop down with no neck, no torso, but as models start to grow into the power law, you're going to see large and then small specialty models develop. And a lot of the smaller ones might have intellectual property in their data, and this is where the action seems to be. I want to get your thoughts on this, because what you're saying is that the infrastructure will look a lot like this. Can you explain your vision? Because this is in context to the models themselves in a power law. And again, the big ones are getting bigger, and the smaller ones could be small and be great, right? And still interact. So we're seeing a lot of diversity, but they're going to work together; that fusion's happening. How does the infrastructure, from your perspective, your vision, play into this concept?

So in this particular concept, if you think about these extra-large models, these mega models being built by Google and Amazon, by Cohere or OpenAI, they're really going to solve this really complex reasoning problem around natural language generation. I think it's going to be hard to compete in that game of bigger parameters, more training data, more training cycles, more compute.
And then if you think about the edge, which is where we're going to operate, our thinking is: how do we separate out the various tasks in generative AI, and separate natural language understanding from natural language generation? So that for each particular domain, if you're really good at the understanding, you can help developers create more meaningful, differentiated experiences, things they can actually back with their APIs, and that can fit on constrained devices. I think the excess capacity being built up by the companies building these excessively large models is going to be an area of investment, and a lot of money is going to be spent that way. The edge is where we're operating, because we feel there's a whole bunch of opportunity if you separate out these two tasks, natural language generation from understanding, and create this new wave of infrastructure that gives developers new superpowers: models densely packed so that developers can create something meaningful for their users. Of course we're biased, but we're building along this vector on the edge, making sure we can run in device-constrained scenarios and be great at NLU, not necessarily NLG, which is what those other models are going to be great at.

So what does that infrastructure look like at the edge, to support those small language models, and how does it differ from the GPUs and the large language models we know in the cloud?

Yeah, so if you think about the stack representation of GenAI, it's early days at the moment. But generally, think about the things that have been durable parts of architecture, or things that have moved: for example, in the early days of the internet, you had key-value stores and new databases emerge, you had web servers emerge, you had application servers, that three-tier stack. In almost every wave of new technology, whether it be cloud native or now generative AI, you'll see people reason about infrastructure in these three or four components. So for us on the edge, we're building this intelligent, prompt-native proxy. It does the thin-layer understanding of what's happening, what users are asking for, and how we turn that nuanced, complex, unstructured prompt into API semantics that any developer can create, to support a new experience. Which means it'd be a drop-in replacement for something you would traditionally put in an architecture at the edge, constrained on devices, constrained on memory, constrained on CPU. What this means is you still have to do a whole bunch of training, but you distill the model down, or perhaps produce an ensemble model, with the objective for it to be fast, to be good at that one thing. So we're taking the approach that training and online inference are two separate use cases. On the training side, we'll spend cycles distilling these things down to their real core, and then enrich the infrastructure, this intelligent, prompt-native proxy we talked about, to do a whole bunch of these interesting things for developers, and then enable developers to build more meaningful applications that are prompt-native, that are inherently intelligent.
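To make that concrete, here's a minimal sketch, not Katanemo's actual code, of the kind of prompt-to-API translation a prompt-native proxy might do: a small NLU model labels the user's intent and extracts parameters, and the proxy forwards them to an API the developer already hosts. The route table, endpoint URLs, and the classify/extract functions are all hypothetical stand-ins.

```python
# Sketch of a prompt-native proxy's "understanding" step: a small NLU
# model maps an unstructured prompt to a structured call against a
# developer-hosted API. classify_intent() and extract_params() stand in
# for a distilled, domain-specific model; the routes are made up.

import requests

# Hypothetical mapping from detected intent to a developer-hosted API.
ROUTES = {
    "get_weather": "https://api.example.com/v1/weather",
    "update_campaign": "https://api.example.com/v1/campaigns/update",
}

def classify_intent(prompt: str) -> str:
    """Stand-in for a small NLU model that labels the user's intent."""
    if "weather" in prompt.lower():
        return "get_weather"
    return "update_campaign"

def extract_params(prompt: str, intent: str) -> dict:
    """Stand-in for slot/parameter extraction by the same small model."""
    if intent == "get_weather":
        return {"city": "Seattle"}  # would really be parsed from the prompt
    return {"campaign_id": "1234", "budget": 500}

def handle(prompt: str) -> dict:
    """The proxy turns a free-form prompt into a structured API call."""
    intent = classify_intent(prompt)
    params = extract_params(prompt, intent)
    resp = requests.post(ROUTES[intent], json=params, timeout=10)
    return resp.json()
```

The design point being illustrated: the understanding step is small and fast enough to live at the edge, while generation can stay with the big models elsewhere.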
So the training side is still slower and expensive, it requires GPUs, but the inference side is where I think there's a whole bunch of innovation happening: how to pack these smaller LLMs so they can run faster and cheaper at the edge.

The themes of contextualization and personalization or customization seem to be the two nice variables that interplay here. Can you give some use-case examples of what you're seeing as innovation and what you see being enabled by GenAI? Let's look at the infrastructure view and jump ahead to the future. Connect the dots for us. What are we going to be seeing? What are some of the things that are going to be automated away? Where does AI actually do some work on our behalf, either to make things more efficient and cost-reduced, or to enable revenue?

I'll focus on revenue, because I think people tend to be trying to build new experiences so that their users can interact with them in new and meaningful ways. The bet is that everyone is going to move away from drop-downs and click-through workflows to conversational experiences, so how do we help developers harness the potential of conversational experiences as core to whatever they're offering? So I'll focus on the revenue aspect at the moment, because I think that's more interesting for developers who want to see if their new idea in GenAI will stick. On that side, it's just making sure, at the moment, that we can rationalize the new stack for them. Of course, the databases, such as the embedding databases, are going to be inherently intelligent. They tend to compress these vectors down to smaller and smaller sizes so they can pick out the right vector embeddings at runtime. And there's going to be a whole bunch of machine learning models that enable that infrastructure to get better and faster in the future. You'll see that theme across all the infrastructure components that can be used to build a GenAI app, including the front door that we're building, which is: how do we use these small LLMs to enrich developer experiences so that they don't have to hire crack PhD teams? Not everybody needs to do that. So I think that's the first generation of these infrastructure components: they're inherently going to be more intelligent, and then people will continue to innovate on compressing the construction or the management of the infrastructure using machine learning. I think that's the next wave. But the first is: how do we help every developer construct something meaningful, so they can take their creative art and put it in front of users, so that users can start experiencing new things using GenAI? That's our focus now, and the focus in the future is going to be using machine learning to further eliminate management, scaling up or scaling down in a more precise way.

And I think that's a great point to highlight. Developers have the power to move fast, but also to get proof points and show that validation to go to the next level. I think that's a key point. Go ahead, please.

I think in this age, most of the applications you see are really a thin summary over some documents, and that's all you're seeing today. The future, I think, is that people can actually perform really interesting tasks that are constrained to the developer's domain.
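As a side note on the "compress these vectors so you can pick out the right embeddings at runtime" idea mentioned above, here's a hedged sketch, under assumed dimensions and a toy corpus, of one simple form of it: quantizing float32 embeddings down to int8 so the index is a quarter of the size, then scoring against a query.

```python
# Hedged sketch of runtime retrieval over compressed embeddings:
# quantize float32 vectors to int8 (4x smaller), then do approximate
# cosine search. The corpus, dimensions, and seed are illustrative only.

import numpy as np

rng = np.random.default_rng(0)
docs = rng.normal(size=(10_000, 384)).astype(np.float32)  # toy embeddings

# Normalize rows, then quantize each component to int8.
docs /= np.linalg.norm(docs, axis=1, keepdims=True)
scale = np.abs(docs).max()
index = np.round(docs / scale * 127).astype(np.int8)

def search(query: np.ndarray, k: int = 5) -> np.ndarray:
    """Approximate cosine search over the quantized index."""
    q = query / np.linalg.norm(query)
    scores = (index.astype(np.float32) * scale) @ q
    return np.argsort(-scores)[:k]  # indices of the top-k documents

top = search(rng.normal(size=384).astype(np.float32))
print(top)
```

Real embedding databases use far more sophisticated schemes (product quantization, learned compression), but the trade-off is the same: smaller vectors, faster runtime selection.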
I think that becomes super interesting, because at this point people are just taking the RAG-based application development frameworks and asking: how do we get a basic summary over some documents? That's interesting in the short term, perhaps a head-turner, but the true experience is being able to perform critical tasks: updating an ad campaign, asking for insurance updates and then following through to make sure the insurance was actually applied, things of that nature where today we'd pick up the phone and talk to an agent or somebody on the other side, accomplished in a more meaningful way. That's going to come online. I think developers need a lot more help, in terms of intelligent infrastructure, to get there.

So now that you're taking care of all that management headache, that heavy lifting, if you will, I want to bring the conversation into the data domain. You're basically thinking of it from the customer's perspective: they're now able to spend time focusing on their data, their data quality, their differentiation. How do you see that playing out? What advice would you give customers in terms of getting their data act together, thinking of small language models or small data sets that are really precise, with unique IP?

I think the same thing applies as it did, say, 10 years ago with AWS. There are certain things that won't change. The desire to build something differentiated won't change. The desire to have a clean data set so that you can train your models won't change. And in the same vein we talked about: focus on your business logic, focus on the things that enable a more differentiated experience, and don't focus on the muck. Those things won't change. So data hygiene and data quality are absolutely essential for any machine learning training. The algorithms perhaps are commodity, but your data is not. And if you truly want to get past the POC stage and do a little more than a thin summary of your documents, then you have to approach the whole thing asking: what are the building blocks I need to do that? Of course data hygiene is important, making sure you have the right curated set that you're going to use to train models or use for in-context learning, and then being able to put together the stack that accelerates your development. I think that's what's missing in the market in general, but people are catching up to this idea that we need a stack representation that gives developers new powers while they focus on business logic.

Yeah, I think you're right. Earlier this morning we had Walmart and Uber on, top and distinguished engineers talking about their systems, and these are forward-thinking. We also had some hot startups on, like yourselves and others. There seems to be a trend toward new kinds of systems that are a compilation of DevSecOps, cloud native, next-gen scale, and obviously generative AI is powering this kind of movement. Uber's been around for a while, and you have experienced a lot of the infrastructure side at Oracle and AWS. So the question I have for you: what's your observation on who's building this stuff? Because if you have an end-to-end system, there's no one observability solution that's going to work. There's no one database that's going to work. You're talking about an end-to-end system.
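For readers unfamiliar with the "thin summary over documents" pattern being contrasted here, this is a minimal sketch of a RAG pipeline under stated assumptions: embed() and generate() are stand-ins for whatever embedding model and generative model endpoint you use, and the documents are invented.

```python
# Minimal sketch of the RAG "summary over documents" pattern: chunk and
# embed documents, retrieve the chunks nearest a question, and hand them
# to a generative model as context. embed() and generate() are stand-ins.

import zlib
import numpy as np

def embed(text: str) -> np.ndarray:
    """Stand-in embedding: deterministic random vector per text.
    Replace with a real embedding model for meaningful retrieval."""
    rng = np.random.default_rng(zlib.crc32(text.encode()))
    v = rng.normal(size=128)
    return v / np.linalg.norm(v)

def generate(prompt: str) -> str:
    """Stand-in for a call to a large generative model."""
    return f"[summary produced from a prompt of {len(prompt)} chars]"

docs = ["Q3 revenue grew 12%...",
        "The policy covers water damage...",
        "Ad campaign 1234 targets the Pacific Northwest..."]
index = np.stack([embed(d) for d in docs])

def answer(question: str, k: int = 2) -> str:
    q = embed(question)
    top = np.argsort(-(index @ q))[:k]         # nearest document chunks
    context = "\n".join(docs[i] for i in top)  # stuffed into the prompt
    return generate(f"Context:\n{context}\n\nQuestion: {question}")

print(answer("What does the policy cover?"))
```

The point being made in the conversation is that this pattern is the floor, not the ceiling: task execution against the developer's own APIs is the harder, more valuable step.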
You can't just come in and drop something in. All the vendors that want to sell stuff to an enterprise are saying "we had this before," but it doesn't really fit into the new thing, because companies have a platform engineering person, a data engineer, an MLOps person, a Kubernetes person, a DevRel person, and a CISO representative. Like six people. What do you call this group? Is it forming? Is it just DevSecOps, platform engineering? How would you describe it?

So what we're seeing in the market is that people are putting together these autonomous teams with budget, and they've been given a little bit of rope to go invent. It very much feels like the early AWS days, where developers were given a little more agency in making the choice. And right now it's this really stovepiped tooling approach to everything: I've got a whole bunch of tools I'm going to cobble together to do something. So, two observations. One is that people are carving out budget and autonomy so that teams can move a little faster. I think these things will eventually merge into the workflows where platform engineering teams and DevSecOps teams have to approve the final bits that go out. But right now we're seeing a whole bunch of: okay, there's a small autonomous group, it's been given a bit of a charter and a little bit of money, go build. And do it in a POC state, because we have to get all these other stakeholders in line. That's just one. The flip side is the emergence of any interesting technology: you're going to have a whole bunch of tooling, and the initial forms of that tooling are these stovepipes you've got to merge together and converge. I think you'll start seeing more infrastructure players take a very systems approach to this and have an opinionated stack in the early days. Eventually, the Cloud Native Computing Foundation will perhaps have a bit of a role to play in standardizing some of these systems as developers vote with their usage. But you'll have these two things in the early days. And as we see some users who are actually delighted by this experience and want to take things into production, that's the next wave: how do you get the DevSecOps people, the platform engineering team, really bought into the new system? It incorporates the ideas of the past system, perhaps redesigned very differently, because prompts are very different and GenAI is very different.

I think this is going to be a big opportunity for startups and mid-range, B-round, C-round companies that are growing, because the older incumbent companies might not have the product to deliver, or the professional services. So, great opportunity. Let's talk about your company, okay? You guys are at the front end of this wave's growth curve. We always say there are two modes: if you go too fast, you're out in front and you become driftwood, or if you're too late, you miss the wave. So, I won't say timing the wave, but if you're riding the wave, you understand the situation. Tell us about your business, what you're doing, what you have your eyes on in terms of the value proposition, how you're attacking it, and what some of your early deployments look like.

Yeah, so the entire company's thesis is on things that won't change versus things that will change, because it's super hard to predict what will change.
And I think developers, whom we're targeting with our intelligent infrastructure services, will always want to balance cost with quality. They'll always want to make sure they have choice, that they have control. So when we think about how we build our business and create this set of capabilities, we're looking at things we don't think will change. Meaning: we know at this point that the rate at which large language models are being introduced to the market is hard to keep up with. So what you need is something that enables you to test and try these things in production and makes it easy for you; the intelligent, prompt-native proxy that we're building has these routing capabilities built in, and model choice and credentials management are super easy. People will want control over the experience. So function calling, being able to translate prompts into APIs hosted by the developer: super interesting, and that won't change. The desires won't change. That's what we're focused on: things that won't change, and building a business around them. In the early days we've taken a very prescriptive view of the type of infrastructure we're going to build, which is this proxy service, an open source project that's going to be announced very soon, and that uses small language models to harness the full potential of large language models, if you will. And I think that will be the constant bet of this venture: focusing on what's going to be perennial, not what's going to change tomorrow. Frankly speaking, infrastructure has to be built that way. It has to have that durability, and we get to play into that, repeating a playbook of the past but designing it very differently.

So if I'm a customer, a user: I have theCUBE, and we have a small language model over all the CUBE interviews we've ever done. We've got vector embeddings, it's got RAG, got retrieval. It's a nice search edge, and we link to the videos; it's small. I was talking to the CEO of Kong, they do a lot of API work, and they said, hey, you should build an API into the CUBE language model. So let's just say I did that. How does that work? How do you see that developing? Because I could see a model where developers call language models as a service and say, I need some B2B talk, or maybe some CUBE jargon, like Kubernetes, to spell it properly, or whatever jargon we might have. So every small language model is domain-specific? Well, not all of them, but most of them will be.

Yeah, I think the nature of the beast is that small models, because they don't have a lot of parameters, will tend to focus on a smaller domain: perhaps a financial domain, an infrastructure domain, or what have you. And the ability to engineer and deliver these models for the right use case at the right time is going to be a very important challenge in the future. This notion of being able to call into these models, and having a marketplace where you can simply spin up the things you need for a particular use case, is certainly interesting. I think there's an emerging need for that as people settle on use cases, and you guys are perhaps ahead in your formulation of the use cases. There will be a need for it. Today, for example, we're not focused on compute orchestration, which is where this would be.
We're focused on the data side: ingesting the prompt and making sure we make the right decisions on behalf of the developer and the user. So I think there's an opportunity, as you mentioned, for startups in general to look at the space entirely net new and ask: how do we attack it so that we can help the next generation of developers be more productive and successful? And there are going to be folks along this journey who make the compute side of things easier, especially around access to a small model, precisely trained on specific data, that's ephemeral, because when you're done with it, you don't need it. I think there's an opportunity there for sure. On that particular type of opportunity, today we're just using and training these small, domain-specific models for natural language understanding, and then enabling developers to build and construct a great GenAI experience that can harness the full potential of these mega models.

And Salman, you mentioned you're not focused on compute orchestration. Presumably either it's not your expertise and somebody else is going to solve it, or you feel it will be solved. Or is it a dependency for your success?

I think developers right now tell us they can't ship anything to production. If you ask them why, there's a nervousness factor: oh my God, I'm exposing these endpoints without any security against adversarial prompt attacks, for example; gosh, I need to make sure I can balance cost with quality and route to GPT-3.5 versus GPT-4. So they're focused on these more perennial challenges today. As they emerge from that and say, you know what, now we're going to train some small models and make sure they sit right next to my access to a very, very large model, then you go to the next stage of infrastructure development on their behalf. You know, the survival of any startup is making sure you pick the problems you can solve for developers today. So we're trying our best to focus on that and build a business around it. But if you ask about pedigree: yeah, we've built these infrastructure systems for 20 years each on the team, over 80, 90 years combined. So we feel we can do that, but we've got to focus on one thing in the early days, then the next one, and then we get the opportunity to serve customers.

It's early days, and Katanemo has a good focus on the developers, because you've got to be where they are right now. They need to get stuff into production; that's coming, and you've got to hit their needs now. Final question, for the last minute we have left: talk about your company, give a plug for what you're working on, how long you've been around, what you're looking to do, whether you're hiring. You mentioned this open source project. Give a quick plug, Salman, for the company.

Yeah, I totally appreciate the opportunity. So we like to describe the company as operating at the intersection of science and hyperscale. We believe that's how the next generation of infrastructure is going to be built. If you're lacking capacity in science, you're really not going to build something interesting, and if you're just overweighted on infrastructure capacity, you're not going to capture the net-new opportunity.
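As a rough illustration of the "balance cost with quality, route to GPT-3.5 versus GPT-4" and adversarial-prompt concerns just described, here is a minimal sketch; the guard list and the length-based routing heuristic are deliberately naive placeholders, not Katanemo's implementation, and a real proxy would use a small classifier model for both decisions.

```python
# Sketch of cost/quality routing plus a crude prompt guard in front of
# two OpenAI models. The heuristics are placeholders for illustration.

from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

BLOCKLIST = ("ignore previous instructions", "reveal your system prompt")

def guard(prompt: str) -> bool:
    """Crude stand-in for an adversarial-prompt detector."""
    return not any(p in prompt.lower() for p in BLOCKLIST)

def pick_model(prompt: str) -> str:
    """Route short/simple prompts to the cheaper model, the rest upward."""
    return "gpt-3.5-turbo" if len(prompt) < 200 else "gpt-4"

def complete(prompt: str) -> str:
    if not guard(prompt):
        raise ValueError("prompt rejected by guard")
    resp = client.chat.completions.create(
        model=pick_model(prompt),
        messages=[{"role": "user", "content": prompt}],
    )
    return resp.choices[0].message.content
```

The value of putting this in a proxy rather than in every application is the point made throughout: developers get security and cost control as infrastructure, not as code they each rewrite.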
So Katanemo operates at this intersection of science and infrastructure, and we're building the first layer of it: this intelligent, prompt-native proxy, an open source project we'll make available to developers soon, so we can get started on the journey of inventing the new intelligent infrastructure. In terms of hiring, in terms of people: as I mentioned, we have applied scientists on the team, we actually have a professor of AI on the team, and we have folks who've built this type of infrastructure at Microsoft, AWS, and Lyft in their past lives. We'll continue to bring these two cohorts together and create something magical at the intersection of science and infrastructure. That's been the thread of my 25-plus years of building infrastructure, and I just don't know what else to do.

Well, the world needs you right now as an AI innovator. We need you. And again, I love the pragmatic approach: meet the developers where they are now. You kind of know what's going to happen in the future, but you've got to let that bake out a little bit. There's a real need for this infrastructure to evolve faster, and reliably too, and at scale, so the developers can do the magic. We applaud you, and we're on the same vision. Congratulations, and we'll keep in touch. Thanks for coming on and contributing to our show today; we appreciate your time.

All right, thank you so much, everybody. I appreciate the time.

Okay, thanks. We hear from the founders making it happen; they're innovating. We've got the venture capitalists, and we've got the big companies who are on the front of this wave, on the major growth curve. Of course, Supercloud 6: we're going to have a wrap-up after this live feed, right after the short break.