Hello and welcome to this special CUBE conversation. I'm John Furrier here in the Palo Alto CUBE studios in the heart of Silicon Valley, your host of theCUBE. We're here with Howie Xu, CUBE contributor, always part of our SuperCloud events. Friend of theCUBE, a CUBE alumni many times over, senior executive, entrepreneur from the early days at VMware, done many things, a legend. Great to see you as always. Good to see you. Always great commentary. Yep, first time in 2024, and happy Year of the Dragon to you and everyone. It's going to be the Year of the Dragon. SuperCloud 6 is coming up, very focused on startups. We'll see you there, and I know you've got a lot of interviews lined up. I'm going to line up a few entrepreneurs and executives as well. I love having you on theCUBE and in the CUBE community because you're doing interviews. You're kind of a quasi host now, but you're also a senior executive, entrepreneur, and technologist, so I love the way you put those panels together. Thank you for that. All right, let's get to it. We're talking about AI and gen AI and hot updates from the field. I follow you on Twitter and LinkedIn. You're pounding the pavement virtually and physically in Silicon Valley and beyond. What are you seeing right now? We're seeing a lot of action; a lot's changed in the past few months. One, funding's through the roof, but the landscape is starting to emerge. You're seeing the open source models catching up in functionality and adoption, almost closing in on the proprietary models. Look at the adoption graphs. We're seeing some hot startups. Google just announced a new context window with their Vertex Gemini product. What are you seeing? Gemini 1.5 Pro yesterday, right? A huge deal: one million tokens, and ten million tokens in testing. That's huge for large language models.
And then of course, on the same day, OpenAI announced Sora, which, I was joking with people, attention is all you need, right? They got all the attention. But give them credit, both Google and OpenAI, amazing models, amazing progress. There's a lot of hype. Everyone's hyping themselves up. Sam Altman is the ultimate P.T. Barnum in this market right now because he's actually running the circus that is AI. He's got the chip conversation going. Seven trillion. Seven trillion with a T. Have you ever seen that before, by the way? He upgraded to eight trillion yesterday, still with a T. Have you ever seen a press release or a news story with the word trillion in it, in terms of investment? Yeah, to be honest, probably like everyone else, when I saw the seven trillion I thought, wow, seven billion, you're kidding me, why seven billion? That was my first impression. And then, literally, it's a T. Yeah. T stands for trolling, by the way, too. He definitely is trolling the media, and I think his point is, look, we're thinking big. I think the number is BS, in my opinion. If you look at the numbers, there's no way that's ever going to happen in our lifetime, because that's the whole world, population, valuation, all the tech... 10% of the entire world's GDP. And by the way, that's how much he wanted to raise. The valuation would be, what, 10 times that? That's the entire world's GDP. Okay, so he did troll everyone and the press fell for it. They'll fall for anything these days. But the point is, it speaks to the magnitude and scope of this opportunity. You and I have been talking on camera and off camera about the waves we've been involved in, and this wave in particular is a large inflection point.
From an order of magnitude standpoint, it would sound like hyperbole if you didn't believe it yourself, but this wave is actually very, very big. It's changing a lot, and it's changing every day. It's a huge opportunity. Yeah, I think there are two sides of the coin here. You said the trillion is BS; we'll get to that in a second. The trillion, yes, I said it's BS. Not the wave. Look, I can see where he came from. They just released Sora, the model, and everyone was amazed by it. Guess what? He can easily, easily use 100x the computing power he has today. With two or three orders of magnitude more, that's 7 trillion. So I don't think it's completely irrational if you look at it purely from a demand point of view. He can easily consume 100x, perhaps 1,000x, the computing power. But on the other hand, from a technology point of view, it's more than just the GPUs. Once you have the GPUs, how do you stand them up? You need the space, you need the power, you need the cooling. It takes forever to even build that out. One of my friends is at one of the big cloud service providers. I met him over the weekend, and he said everyone's busy just scooping up data center capacity. The bottleneck is actually the space and the cooling and the power, not just the GPUs. So that's something for people to keep in mind. Another anecdote he shared: he's a senior executive negotiating a deal for a data center in the middle of nowhere, because that's where most of the data center capacity is going to come from. And guess what? Satya would fly out to this middle of nowhere, bring the check, and in front of everyone say, I'm here, I can sign the check right now. You have the CEO of Microsoft doing that sort of thing. That shows you how big the data center question is, the space, the cooling, the power, beyond just the GPUs. Just to add that perspective. And also, I know you've been talking a lot with them.
You're going to have Harrison Chase from LangChain coming on theCUBE. You're going to host a panel with him, among other startups. They just had news: LangChain is now generally available. Yesterday. I love that company; I've been following him on Twitter. There's a bunch of other companies doing a lot of RAG, retrieval-augmented generation, stuff. So there's a lot happening, and people are trying to figure out: do I do embeddings? Do I have vector databases? Where do I hold stuff? What do I do with the data? Do I need to be an AI company to take advantage of my data using retrieval? There's a whole plethora of conversations about that. And then the basic stuff, like where do I host this? Because Llama's great, but you've got to host it. And is it worth hosting if you can just go to OpenAI? So again, you're starting to see things form on the landscape that we can see out on the battlefield, out in the valley of opportunity here. What are you seeing get the most traction? Are people hosting and adopting and building their own? Are they going to use the managed services? What are the approaches? There are two things. One is the frontier models versus the open source models. The open source models are advancing. Llama 2 was released mid-2023 and actually got an ecosystem gravitating toward it, which is pretty good. Then Mistral came out toward the end of 2023, from nowhere, and it's been the talk of the town for the last few months as well. So I do think the open source ecosystem has life. But the frontier model guys have the GPUs, the space, the power, and the cooling I was talking about. Obviously they have the talent, but they can also afford to train the large models. The smaller guys, whether open source or not, have an issue.
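The embeddings-and-retrieval questions raised here can be sketched minimally. This is a toy illustration of the retrieval step in RAG, not any particular product's API: it uses a bag-of-words vector in place of a real learned embedding model, and the documents are made up for the example.

```python
import re
from collections import Counter
from math import sqrt

def embed(text: str) -> Counter:
    # Toy "embedding": a bag-of-words term-frequency vector.
    # A real system would call a learned embedding model here
    # and store the vectors in a vector database.
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))

def cosine(a: Counter, b: Counter) -> float:
    # Cosine similarity between two sparse vectors.
    dot = sum(a[t] * b[t] for t in a)
    na = sqrt(sum(v * v for v in a.values()))
    nb = sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(query: str, docs: list[str], k: int = 2) -> list[str]:
    # Rank documents by similarity to the query; keep the top k.
    q = embed(query)
    return sorted(docs, key=lambda d: cosine(q, embed(d)), reverse=True)[:k]

docs = [
    "Llama 2 is an open source large language model from Meta.",
    "Vector databases store embeddings for fast similarity search.",
    "Data center cooling and power are major build-out constraints.",
]
context = retrieve("Which model is open source?", docs, k=1)
# The retrieved context is stuffed into the prompt sent to the model.
prompt = "Answer using only this context:\n" + "\n".join(context)
```

This is the "use your data without training a model" pattern the conversation describes: retrieval selects relevant text, and generation is grounded in whatever the retriever returns.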
I mean, Meta is going to spend $10 billion on GPUs alone. And I was looking at Google's earnings report, $11 billion, and Microsoft's earnings report, $11 billion in CapEx in just one quarter. And they said they're going to have a step-function increase in CapEx. So the service providers, the guys hosting the data, have an advantage: they have the cards, they have the space, they have the cooling. The hyperscalers have a great advantage. They do. Look at Meta. Meta just announced Hock Tan, CEO of Broadcom, on their board, as well as John Arnold. Both bring different experiences to Meta. And with Llama and the success Zuckerberg's having, he's got a huge platform. He's got a lot of data too, with WhatsApp, Instagram, and Facebook at large scale. They could be the next Amazon. It's a very smart move, right? Hock brings not just the hardware perspective from Broadcom; Hock Tan is also a very financially disciplined guy and can bring that perspective to Meta. So I think it's going to be a great addition. The chips, the apps, it's going to be great to see. And the cooling and energy required is another constraint, so it's going to be hard to get into this hosting game when the barriers to entry are so high. That's why I kind of like the efficiency we're seeing on the open source side, where you can run a model on a mobile phone or a Dell computer. You can start to see that kind of configuration, which should open up the edge and maybe personal devices. Yeah, it leads to a question: what are the generative-AI-native applications, right? So far everything we've talked about is still the kind of application that came out of the last generation. Do we even know what the generative-AI-native applications are? Should it just be a chatbot? I think it's going to be a lot beyond that. But what is it, right?
That's, I guess, a trillion-dollar question, not just a million-dollar question, right? Look, it's going to be like the internet era you and I went through. First, it's the infrastructure guys taking advantage. You have the Ciscos of the world. Cisco at one point was the largest market cap company, back in 2000, believe it or not. NVIDIA is going to follow a very similar path. They're going to benefit the way Cisco did when the internet first boomed; NVIDIA is the Cisco of twenty-some years ago. But after that, there will be the internet-native applications, the YouTubes of the world, the cloud and mobile native applications. I don't think we have seen, or even understand, what the gen-AI-native applications are. But the good news is I don't think we have to wait ten years. You're learning, you're seeing it fast. So let's jump into that, because I think that's a great point. What does that even mean? If you look at the trends, infrastructure investment is high right now, in the enterprise and in startups, because people are going to need to host stuff, either foundation models or their own data alongside a model. CNBC had an article last December: lots of profits for NVIDIA, lots of expensive experiments for everyone else. And the Wall Street Journal had an article yesterday saying Copilot revenue is dropping a little bit; some people are turning it off, maybe aren't that happy with it. Dave and I had a big debate on theCUBE about that. I was saying it's just a fluke; I think people are going to continue to see that subscription work, but that's a different story. So you've got the infrastructure piece. Then you have people with data who don't need AI per se, or a model.
So if you look at the power law curve we've been describing, the long tail will be companies with data that just want to do RAG, retrieval-augmented generation, and that's it. They don't need their own model, because their language is good: it's purely parochial, proprietary, their own IP. But they can integrate that with other workflows or models. So you start to see this notion of mashing up models. Look, every Fortune 500 company is doing some kind of copilot, some kind of chatbot, some level of RAG, right? Maybe a little bit of fine-tuning, but RAG is almost table stakes for everyone. Everyone is doing it. I personally don't view RAG as the gen-AI-native application. It's very much like the early days of the internet, when everyone just brought the goods that used to be on the physical floor online and called it e-commerce. That's one way to look at it, but I think we can do a lot better than that. RAG is the first step. I wouldn't say easy; it's like the content. To use a web analogy, if I'm a content producer, I host my web page somewhere and digitize my goods on the web page. And you AI-enable your goods, or embed them. You make them compatible with other things. Do you see it the same way or not? I see it the same way. With the enterprise, because you and I look into enterprise companies a little bit more, it's a little harder than the initial version of internet e-commerce because of control. When you do RAG, what's the answer? Hallucination. How do you control it? There's an unpredictable part of the large language model, so people want some control over it, whether for safety reasons or quality reasons. And it's unlike traditional software.
With traditional software you can write a piece of code and put the boundary right there. Here the boundary is harder to rein in and control. That's why this time, with RAG, Fortune 500 companies started their RAG experiments about a year ago and thought that by the end of 2023 they'd all have production stuff. But that didn't turn into... It's embryonic. There's no observability. How do you figure out memory? What's the answer? Can you repeat it? Can you cache it? There's no real system yet. It's still growing. Observability, safety, security, you're right. Privacy, right? And then we haven't even talked about cost and latency and all that yet. So I think it will take another year or two for those RAG-based systems to be in production. When I say in production, I mean not just serving internal people but serving an external audience and making money off it. The big question everyone's asking themselves, and I'll ask you because I know you have a good answer: I've got to make investments. I'm a year into this, I've been tire kicking, I've been laying down some experiments. Never a bad experiment, as I always say, but now you've got to start putting the rubber to the road, meat on the bone, whatever you want to call it. People have to start producing some directional evidence that there's a pony in there somewhere with AI. So what do you see as the best practices from people who are knocking down good use cases, or best practices to get value, whether you're an enterprise or a startup? Where's the startup action, number one? And two, where are enterprise businesses getting value from AI? Let's start with the startups. Yeah, look, we're living in Silicon Valley, and Silicon Valley is a bubble unto itself. There are two routes for people to take.
One is to do something to entertain the audience within the bubble; the other is to do something outside the bubble. I tend to believe there are probably more opportunities outside the bubble, because within the bubble, a lot of the functionality, a lot of the things people can do, Google can do, Amazon can do, Microsoft can do. Why do you need a startup to fill that gap? I'm not saying there's no gap. But outside the bubble, Silicon Valley has what, at most 5% of the companies on the planet? There are so many traditional companies that barely have their digitization done. Now, talking about AI, how can you marry AI with their data, with their workflows? I think there's a lot more potential, a lot more to be done over there. That's how I see it. You know, I always view pharmaceutical... Drug discovery. I always view the startup opportunities in markets where, as Andy Grove used to put it, there's a blurry collision between two worlds as the atoms collide. He had a famous talk at MIT about when your markets are in transition. You have this kind of weird state, not yet seen, but the visionaries see it first. The best ideas are often the ones that are weird or look different. And this whole idea of, well, the cloud guys could do everything, why even start a company, because Amazon could do it or Google could do it? There's always a contrarian entrepreneur out there who's going to say, no, no, what I'm going to do is going to be game changing. So there has to be something out there. There has to be something, but your workflow plus your data is something new. What is that thing? Yeah, and I'll say something controversial here. If it's pure RAG, I think that opportunity belongs to the Googles and Microsofts of the world. It has to be beyond RAG.
I think RAG is table stakes. Just because you do RAG right, I don't know that that's a company, a generational company. But it has to be... It's a feature. It's a feature. I mean, we haven't solved that problem, right? As much as I say it's table stakes, it's not commoditized. Everyone can spend three months and get a RAG system to be right 60%, 70% of the time, but going from 65% to 95% is an uphill battle. There's a lot of prompt engineering required to tune RAG or embeddings. You're not really tuning it in the classic sense of tuning AI, but you have to play with the data and iterate through. Play with the data, right? And the model is evolving too. You wonder, hey, this was right yesterday; how come it's different today? Can we have memory for that? Like, we got the right answer, store it. Yes, yes. But this is what's different: it's generative. Yeah, I think that's the word, right? It's hard to put controls on this software. In the past, you wrote a piece of code, if-then-else, and we software engineers knew exactly what to expect. But this time around, you write code and you don't have a definitive answer for what to expect out of it. Part of that is because the entire system also depends on the data flowing through it, not just the piece of software you wrote. So I think it's a paradigm shift: how you write software, how you test it, how you observe and then secure it, it's all different. What do you see that's different, for people watching or listening to this interview? You've built things in multiple generations of tech, and now we're in this next wave. What are people doing differently from a startup-building perspective? How is the engineering different?
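The "we got the right answer, store it" idea mentioned above can be sketched as a simple response cache sitting in front of the model. This is a minimal sketch under assumptions: `ask_model` is a made-up stand-in for any real (nondeterministic) LLM call, and the cache is an in-memory dict rather than a production store.

```python
import hashlib

cache: dict[str, str] = {}

def ask_model(prompt: str) -> str:
    # Hypothetical stand-in for a real LLM call, which could return
    # a different answer each time for the same prompt.
    return f"answer to: {prompt}"

def cached_ask(prompt: str) -> str:
    # Store the first (vetted) answer keyed by a hash of the prompt,
    # so repeated queries return a stable response instead of
    # regenerating, and thus possibly changing, each time.
    key = hashlib.sha256(prompt.encode()).hexdigest()
    if key not in cache:
        cache[key] = ask_model(prompt)
    return cache[key]

first = cached_ask("What is RAG?")
second = cached_ask("What is RAG?")
assert first == second  # repeat queries are now deterministic
```

The design tradeoff is exactly the one raised in the conversation: caching buys repeatability and lower cost, but the cached answer can go stale as the underlying model and data evolve.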
Is there anything noticeable in how people are starting projects, in execution? We've heard all the best practices: hey, if you want product market fit, get three engineers, no more than three, solve a problem quickly, and then scale it. What are you seeing now? I brought up this concept about a year ago, the solopreneur, right? And it has turned out to be more and more true. If you look at some of the amazing or very interesting companies, they only have five people by the time they launch an interesting product. Partially that's because of AI, a lot of low-code and no-code, and also because the foundation models allow you to launch something very interesting. I think the solopreneur concept is going to become more true. Again, when I say solopreneur, it doesn't mean literally solo; it's just much smaller. Get a few great people who code well together, they do magical things, and you nail the product market fit. Right, and get the market entry, which is the hard part, with the business model. Yeah, one of my friends came to me a few months ago and said, hey, let's do a startup, do this and that, and by the way, I'm going to outsource certain pieces this way and that way. And I told him we need to rethink this. A lot of the things that would have been the outsourced part five years ago are actually not the outsourced part now. Rethink it. I don't know the latest with my friend, but hopefully he's given it some thought. Yeah, Amazon had the two-pizza teams, but more and more I'm seeing, hey, give me three of us, we have an eye on a beachhead for product market fit.
Let's go code it, test it, get validation, then reassess where we are, whether it's technical debt or more features. Nail it first with great coding, rather than chasing the big idea. Move fast. So speed becomes a big part of this. Yeah, the other thing is, you see some of these companies, OpenAI included, they only have 500, 600 people, but that doesn't count the people who label the data for them, the contractors. I've been engaging with a few companies, asking, hey, how many people do you have? Okay, 100, or 500, however many. Well, then they have five or ten times more contractors doing just that one narrow thing. That's a very interesting phenomenon: there are so many data labelers. So when you ask me what I'm up to, I'm thinking about starting my data labeling career at some point. It's interesting. As we get older, I always say on theCUBE, I wish I was 25 again, because if you're out there 25 years old, he or she, this market is beautiful for an entrepreneur. This is good for entrepreneurs. And for the younger generation, I was looking at the authors of the Sora technical report; some of them just finished their PhDs last year, and they were able to lead such an interesting project as Sora. Well, I also joke, I use the 25 reference because I wish I was 25 again, that's my personal thing. But if you look at entrepreneurial activity, a lot of systems guys from our generation are really contributing to companies, either as entrepreneurs or advisors, because we grew up in the old days of systems, when we had to build everything.
Remember back in the 80s, it was all about systems. You had to build your own operating system, your own memory controller; everything was systems level. And now, with the cloud, and with AI going distributed on premises, that systems mindset, systems thinking, is a big part of it. What's your reaction? Are you seeing that translate into this new generation? Are they thinking like systems thinkers? There was design thinking, then the agile startup, and now it's what I call systems thinking. What's your reaction? Yeah, I think there's a lot of room for even companies like OpenAI to optimize the system so they use less computing power and still get as good, or potentially better, results. I want to say that this time around, things have been a little rushed, and there hasn't been enough systems thinking in the large language model world, at least based on what I've observed. That's number one. Number two, ultimately, like you said, no matter what, it's computer science. The same principles apply generation after generation. So I have no doubt that people with systems-level thinking will still have jobs to do, beyond the data labeling career. So a couple of key news items I want to end on. One, Gemini from Google Cloud: a huge context window in Gemini 1.5 Pro, which is part of the Vertex family of AI products. One million tokens. Yeah, one million tokens. That's roughly equivalent to one hour of video, 11 hours of audio, codebases with over 30,000 lines of code, or over 700,000 words. So this is, again, the innovation we're seeing. Where do you see this going next? What are you excited about in this market?
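Those context-window equivalences are easy to sanity-check with the common rule of thumb that one token is roughly three quarters of an English word. The ratio is an approximation, not an exact figure, and varies by tokenizer and by content type.

```python
def words_that_fit(context_tokens: int, words_per_token: float = 0.75) -> int:
    # Back-of-envelope capacity estimate: tokens -> English words,
    # using the rough ~0.75 words-per-token rule of thumb.
    return int(context_tokens * words_per_token)

# A 1M-token window holds roughly 750,000 words by this rule of thumb,
# in the same ballpark as the 700,000+ words quoted for Gemini 1.5 Pro.
print(words_that_fit(1_000_000))  # 750000
print(words_that_fit(128_000))    # 96000, for a typical 128k window
```

The same arithmetic explains why a larger window matters: it determines how much retrieved context, code, or transcript you can hand the model in a single prompt.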
Because as this continues, more and more innovation is coming. A bigger context window means more tokens are available, which has been a big constraint until now. It's going to open things up. Well, going back to the key reason for this conversation: what does that mean for entrepreneurs? That's the world we're rooting for. Look, it means it's harder and harder for the smaller guys to play in the game of large language models, because you need so much resource for that foundation model. But the good thing is, the foundation model, to me, is like a new generation of the cloud. You and I discussed it; I actually pointed out within a week of ChatGPT's release that the biggest revolution is the cloud, not just search. I think at this point most people would agree. At first people said, oh, this is the new Google search. But guess what? A year later, Bing didn't move the needle at all; maybe they increased market share by 1%, and that's about it. But the cloud is a different thing. We're still in the middle of going from the traditional definition of the cloud to the intelligent cloud. To me, the foundation model makes the cloud 10 times, 100 times more intelligent, which means people on that cloud can build interesting applications. We're at that tipping point, at that juncture. Perplexity is more like a search engine, but that's an application. I mean, to your point, data is the ultimate thing happening here. That's what we're talking about, data, at the end of the day. Aravind is a great entrepreneur, a PhD from Berkeley. But it's still way too early to declare victory. Yeah. Not a bad product.
Again, you don't know who's going to win it. My overall point is, with the new cloud, not just 10 times but 1,000 times better than the previous generation of cloud, what applications are we going to build? From an enterprise point of view, from a consumer point of view, I think we're still confused. When I say confused, I mean we're still in the middle of searching for that thing. It's like we're in the late 90s of the internet era. Everyone sees the potential of the internet; clearly you can do so many amazing things. Early winners come out, and... But as soon as you put a YouTube kind of workload on it, you're going to suffer. Performance, right? You don't have broadband yet, you don't have the servers yet, and you're going to suffer. With the internet, it took a few more years before we could see a YouTube. It's like a child saying, I want to be a professional basketball player. You can hope, but you're not tall enough yet. Or, I want to be a pro musician. It just needs to evolve. This is evolutionary; it's embryonic, early days, like the web, to your point. But okay, let's take it to the next level. What happens next? Do we see people with data getting the value? I was just talking to someone about Mobile World Congress; we're going to MWC in Barcelona. We were talking about 5G, and the answer was, hey, that's data. It's a technology with a lot of data around it. Wouldn't it be great to make that a model, its own foundation model? There are two angles on the data, right? One is the data we already have: clean it up, massage it, all that sort of thing. But the other angle is actually having more synthetic data, more data generated out of the data.
Meaning, hey, I have this data, I do analysis, and then I have a piece of data that's far more useful for you. So there is the raw data, the internet data, that sort of thing, but then there is the curated data, the analyzed data, the synthetic data based on the raw data. I tend to believe that moving forward, when we look back, it's the synthetic data and the curated data that's going to be more meaningful. Like Sora: I don't know that they gave all the details, but the wide speculation from experts is that a lot of the data they used was synthetic, because the kind of data you need for training is pretty rare, there's not enough of it, so they had to use a lot of synthetic data as part of the model training. So I suspect the synthetic data, the new data, the derived data, is the more interesting part. I think you're right on that. And to your point about people thinking it's about Google search: search is just a utility for insight, getting something fast, discovering something. Cloud-enabled data allows you to get insights, whether through synthetic data or raw data, data used in a way no one's seen before. The search angle is interesting because that's just an application of the web, discovering things powered by, essentially, a lot of data center, but an interesting experience. On that point, Elad Gil has a tweet, written 20 hours ago: working in AI right now feels like the early slope leading into the singularity, so much happening so quickly. It's got a lot of images on it. So the speed is fast, unlike the web. It took a couple of years in the internet days with the web from the 90s, and then it really accelerated; the bubble popped in 2001.
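The raw-versus-derived-versus-synthetic distinction Howie describes can be illustrated with a tiny sketch: take raw records, derive summary statistics from them, then generate new synthetic samples from the derived distribution. The numbers and field names here are made up for illustration; real synthetic-data pipelines for model training are far more sophisticated.

```python
import random
import statistics

# Hypothetical raw measurements, the scarce original data.
raw = [12.1, 11.8, 12.4, 11.9, 12.2]

def derive(data: list[float]) -> dict[str, float]:
    # Curated/analyzed data: summary statistics of the raw data.
    return {"mean": statistics.mean(data), "stdev": statistics.stdev(data)}

def synthesize(stats: dict[str, float], n: int, seed: int = 0) -> list[float]:
    # Synthetic data: new samples drawn from the derived distribution,
    # useful when the raw data is too rare or scarce to train on directly.
    rng = random.Random(seed)
    return [rng.gauss(stats["mean"], stats["stdev"]) for _ in range(n)]

stats = derive(raw)
synthetic = synthesize(stats, n=100)
```

The five raw points become a hundred statistically similar samples; that multiplication of scarce data is the appeal, and the risk is that the synthetic set inherits whatever biases the derivation baked in.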
But over the next decade, everything that was promised came. So where are we now? It took a decade. It took a decade; are you seeing that shrink, or is it still going to be a decade? We are on a compressed schedule. We're at a very interesting juncture. The technology is evolving faster than even the most optimistic VCs were estimating. Who would have thought OpenAI would be able to launch Sora yesterday, this level of 60-second, pretty interesting video? That Gemini would support one million tokens yesterday? We're not even two full months into it, just a month and a half into 2024. I think even the very optimistic VCs didn't anticipate that. So on the one hand, the technology is evolving faster than even the optimists estimated. But on the other hand, I also have to say, and this is obvious to everyone, landing it, making it a real use case, making a difference, is still slow. It's not there yet. Well, Howie, great to have you on. We've got news on SiliconANGLE right now: Anthropic rolling out new features for combating election misinformation. A lot of things happening super fast. Great to have you on. See you at SuperCloud 6, and looking forward to all your panels. I want to thank you again for being a host and panelist for theCUBE. And if you're out there and you want to bring some panels to theCUBE for conversation, we're happy to host you. Of course, we're going to continue to cover all of this. Call Howie and me; we'll help you out anytime. So thanks for watching this CUBE conversation. See you next time.