Hello, and welcome to SiliconANGLE News breaking story: Amazon Web Services is expanding its relationship with Hugging Face, breaking news here on SiliconANGLE. I'm John Furrier, SiliconANGLE reporter, founder and also co-host of theCUBE, and I have with me Swami from Amazon Web Services, Vice President of Database, Analytics and Machine Learning at AWS. Swami, great to have you on for this breaking news segment on AWS's big news. Thanks for coming on and taking the time.

Hey, John, pleasure to be here.

You know, we've had many conversations on theCUBE over the years. We've watched Amazon move fast into large data modeling; SageMaker became a smashing success. Obviously, you've been on this for a while. Now with ChatGPT and OpenAI there's a lot of buzz going mainstream. It takes it from behind the curtain, inside the ropes of the industry, to the mainstream. So this is a big moment for the industry, and I want to get your perspective, because your news with Hugging Face, I think, is another tell sign that we're about to tip over into a new accelerated growth around making AI application aware, application centric, more programmable, with more API access. What's the big news with AWS and Hugging Face? What's going on with this announcement?

Yeah. First of all, we're very excited to announce our expanded collaboration with Hugging Face. I consider Hugging Face like the GitHub for machine learning, and with this partnership, Hugging Face and AWS will be able to democratize AI for a broad range of developers, not just specific deep AI startups. We can accelerate the training, fine-tuning and deployment of these large language models and vision models from Hugging Face in the cloud. For broader context, when you step back and look at what customer problem we are trying to solve with this announcement: these foundation models are now used to create a huge number of applications, such as text summarization, question answering, search, image generation and creative work. Those are the things we see in these ChatGPT-style applications, but there is a broad range of enterprise use cases that we don't even talk about, because these transformative generative AI capabilities and models are not available to millions of developers. Either training these LLMs from scratch is very expensive, time consuming and needs deep expertise, or, more importantly, developers don't need these generic models; they need them fine-tuned for their specific use cases. And one of the biggest complaints we hear is that when they try to use these models for real production use cases, they are incredibly expensive to train and incredibly expensive to run inference on at production scale. Unlike web-search-style applications, where the margins can be really huge, in enterprise production use cases you want efficiency at scale. That's where Hugging Face and AWS share a mission: by integrating natively with Trainium and Inferentia, we can handle cost-efficient training and inference at scale, and I'll deep dive on that. And by teaming up on the SageMaker front, the time it takes to build and fine-tune these models is also coming down. That's what makes this partnership very unique, so I'm very excited.
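[Editor's note: to make the training-and-fine-tuning workflow Swami describes concrete, here is a minimal sketch of launching a Hugging Face fine-tuning job on SageMaker with the SageMaker Python SDK's Hugging Face estimator. The script name, source directory, S3 paths, hyperparameters, framework versions and instance type are illustrative assumptions, not details from the announcement.]

```python
# A minimal sketch (not from the interview): fine-tuning a Hugging Face
# model as a managed SageMaker training job.
# Assumptions: an AWS account with a SageMaker execution role, and a
# training script "train.py" (e.g. using transformers' Trainer) exists.
import sagemaker
from sagemaker.huggingface import HuggingFace

role = sagemaker.get_execution_role()  # IAM role (works inside SageMaker)

# Hyperparameters are passed to train.py as command-line arguments.
hyperparameters = {
    "model_name_or_path": "distilbert-base-uncased",  # example model
    "epochs": 3,
    "per_device_train_batch_size": 32,
}

estimator = HuggingFace(
    entry_point="train.py",          # your fine-tuning script (assumed)
    source_dir="./scripts",          # directory containing the script
    instance_type="ml.p3.2xlarge",   # GPU instance; Trainium (trn1) is an
                                     # option for models Neuron supports
    instance_count=1,
    role=role,
    transformers_version="4.26",     # framework versions are examples
    pytorch_version="1.13",
    py_version="py39",
    hyperparameters=hyperparameters,
)

# Launch the managed job; training data lives in S3 (bucket is assumed).
estimator.fit({
    "train": "s3://my-bucket/train",
    "test": "s3://my-bucket/test",
})
```

SageMaker provisions the cluster, runs the training container, and tears everything down when the job completes, which is the "no giant infrastructure team" point Swami returns to later in the conversation.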
I want to get into the time savings and the cost savings on training and inference; it's a huge issue. But before we get into that, how long have you guys been working with Hugging Face? I know there's a previous relationship, and this is an expansion of it. Can you comment on what's different between what happened before and now?

So we have had a great relationship with Hugging Face over the past few years, where they made their models available to run on AWS. In fact, their BLOOM project was something many of our customers used. The BLOOM project, for context, is their open-source project that builds a GPT-3-style model, and with this expanded collaboration, Hugging Face has selected AWS for the next generation of its generative AI models, building on the highly successful BLOOM project. The nice thing is that through direct integration with Trainium and Inferentia, you get really significant cost savings. For instance, Trn1 can provide up to 50% cost-to-train savings, and Inferentia can deliver up to 60% better cost and 4x higher throughput. As they train their next-generation generative AI models, those models are going to be accessible to all developers, and a lot cheaper to run as well. That's what makes this moment really exciting, because we can't democratize AI unless we make it broadly accessible, cost efficient and easy to program and use. So, very exciting.

I'll get into the SageMaker and CodeWhisperer angle in a second, but you hit on some good points there. One is accessibility, which I call democratization: getting this into the hands of developers. And access to coding and reasoning is a whole other wave; we'll get to that. But there are three things I know you've been working on that I want you to put into buckets and comment on. One, I know you've been working for years on saving time to train; you mentioned some of those stats. Also cost, because cost is now part of the equation: how hardware and software are coupled, where do I find the GPUs, what does the horsepower cost? And then sustainability, which you've mentioned in the past. Is there a sustainability angle here? Can you talk about those three things: time, cost and sustainability?

Totally. For broader context, AWS has been supporting customers doing machine learning for years, and Amazon has been doing ML for two decades, right from the early days of ML-powered recommendations to now supporting all kinds of generative AI applications. If you look at a generative AI application within Amazon, take Amazon Search: when you search for a product, there is a team called M5 within Amazon Search that brings large language models into creating highly accurate search results. These are really large models, at the scale of tens of billions of parameters, running thousands of training jobs every month on large-scale ML hardware. That's an example of a large language foundation model application running at production scale. And of course there's Alexa, which uses a large generative model as well.
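[Editor's note: as a companion to the training sketch above, here is a minimal, assumption-laden sketch of the inference side that the Inferentia cost and throughput numbers are aimed at — deploying a Hugging Face Hub model to a SageMaker real-time endpoint. The model ID, task, role and instance type are illustrative; Inferentia (inf1/inf2) instances generally require a Neuron-compiled model, so a GPU instance is shown instead.]

```python
# A minimal sketch (not from the interview): deploying a model from the
# Hugging Face Hub to a SageMaker real-time inference endpoint.
# Assumptions: a SageMaker execution role exists; model ID, task and
# instance type are examples. Inferentia instances would additionally
# need a Neuron-compiled model, so a GPU instance type is used here.
import sagemaker
from sagemaker.huggingface import HuggingFaceModel

role = sagemaker.get_execution_role()

model = HuggingFaceModel(
    env={
        "HF_MODEL_ID": "distilbert-base-uncased-finetuned-sst-2-english",
        "HF_TASK": "text-classification",
    },
    role=role,
    transformers_version="4.26",  # example framework versions
    pytorch_version="1.13",
    py_version="py39",
)

# Create the endpoint (this provisions managed infrastructure).
predictor = model.deploy(
    initial_instance_count=1,
    instance_type="ml.g4dn.xlarge",
)

# Invoke it like any hosted model.
print(predictor.predict({"inputs": "This partnership looks promising."}))

# Clean up so the endpoint stops accruing cost.
predictor.delete_endpoint()
```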
The Alexa team even published a research paper showing their model does better on accuracy than other systems like GPT-3. We also touched on CodeWhisperer, which uses generative AI to improve developer productivity, but in a responsible manner, because some studies showed 40% of generated code had serious security flaws. That's why we didn't do generative AI alone; we combined it with automated reasoning capabilities, which is a very useful technique for identifying these issues, so that it produces highly secure code.

All these learnings taught us a few things, which fall into the three buckets you mentioned. We have more than 100,000 customers using our ML and AI services, including leading startups in the generative AI space like Stability AI, AI21 Labs and Hugging Face, and Alexa for that matter, and what they care about falls into three dimensions. One is cost, which we touched on with Trainium and Inferentia: Trainium provides up to 50% better cost savings, and it is a lot more power efficient compared to traditional alternatives. Inferentia is also better in terms of throughput: it can deliver up to 3x higher compute performance and 4x higher throughput compared to its previous generation, and it is extremely cost and power efficient as well.

The second element is that, at the end of the day, developers deeply value the time it takes to build these models, and they don't want to build models from scratch. This is where SageMaker, which according to Kaggle users is the number one enterprise ML platform, comes in. What it did for traditional machine learning, where tens of thousands of customers use SageMaker today, including the ones I mentioned, is that what used to take months to build these models has dropped to a matter of days, if not less. Now with generative AI, if you look at the landscape, model parameter sizes have jumped by more than 1,000x in the past three years. That means training is a really big distributed systems problem: how do you scale this model training, and how do you ensure you utilize these machines efficiently, because they are very expensive, let alone the power they consume? This is where SageMaker's ability to automatically build, train, tune and deploy models really comes in handy, especially with its distributed training infrastructure. Those are some of the reasons leading generative AI startups are leveraging it: they do not want a giant infrastructure team constantly tuning, fixing and keeping these clusters alive.

That sounds a lot like what startups were doing with the cloud in the early days: no data center, move to the cloud. So this is the trend we're seeing, right? You guys are making it easier for developers with Hugging Face, and I get that; I love the GitHub-for-machine-learning idea. Large language models are complex and expensive to build, but not anymore: you've got Trainium and Inferentia, so developers can get faster time to value, and then you've got the Transformers, Datasets and Tokenizers libraries, all optimized for generative AI. This is a perfect storm for startups. Jon Turow, a former AWS person who I think used to work for you, is now a VC at Madrona Ventures.
He and I were talking about the generative AI landscape. It's exploding with startups; every alpha entrepreneur out there sees this as the next frontier. That's the 20-mile march: the next 10 years are going to be huge. What's the big thing happening here? Because some people are saying — the founder of [inaudible] said — oh, these startups won't be real because they don't all have AI experience. John Markoff, the former New York Times writer, told me that so much AI work has already been done that this is going to explode and accelerate really fast, almost as if it's been waiting for this moment. What's your reaction?

I actually think there is going to be an explosion of startups, not because they need to be AI startups, but because AI is now finally accessible, or going to be accessible, so they can create remarkable applications, either for enterprises, or for disrupting how customer service is done, or how creative tools are built. This is going to change things in many ways. When we think about generative AI, we like to think of how it generates school homework or music or whatnot, but on the practical side, generative AI is being used across various industries. I'll give an example: Autodesk. Autodesk is a customer that runs on AWS and SageMaker. They already have an offering that enables generative design, where designers can generate many structural designs for products: you give a specific set of constraints and it generates a structure accordingly. We see a similar trend across industries, whether in creative media editing or various others. I have a strong sense that in the next few years, just as conventional machine learning is now embedded in every application and every mobile app — it's pervasive and we don't think twice about it — and just as almost all apps are built on cloud, generative AI is going to be part of every startup, and startups are going to create remarkable experiences without needing deep generative AI scientists. But you won't get there until you make these models accessible. And I don't think one model is going to rule the world; you want developers to have access to a broad range of models. Go back to the early days of deep learning: everybody thought one framework would rule the world, and it kept changing, from Caffe to TensorFlow to PyTorch to various others. I have a suspicion the same holds here, so we have to enable developers where they are.

You know, Dave Vellante and I have been riffing on this concept called supercloud, and a lot of people have co-opted it to mean multi-cloud, but we were really getting at this whole next layer on top of, say, AWS. You guys are the most comprehensive cloud; you guys are a supercloud. Adam and I were talking about ISVs and ecosystem partners; your top customers have ecosystems building on top of you. This feels like a whole other AWS. How are you leveraging the history of AWS, which, by the way, had the same trajectory? Startups came in because they didn't want to provision a data center or do the heavy lifting. Everything that has made Amazon successful culturally, the day one thinking, is to take on the undifferentiated heavy lifting and make it faster for developers to write code. AI has the same dynamic. How are you guys taking this to the next level?
Because now this is an opportunity for the competition to change the game and take it over. I'm sure that's a conversation internally. You have a lot going on at AWS that makes you unique. What's the internal and external positioning around how you take it to the next level?

I agree with you that generative AI has very strong potential in terms of what it can enable in next-generation applications. This is where Amazon's experience and expertise, and putting these foundation models to work internally, has really helped us. Amazon.com search is a very important application in terms of the number of customers who use it and the amount of dollar impact for the organization, and we have been doing this quietly for a while now. The same is true for Alexa, which not only uses these models for natural language understanding but also leverages them for creating stories and various other things. Our approach from AWS is to look at it in the same three tiers as we did with machine learning, because when you look at generative AI, we genuinely see three sets of customers. One is the really deep, technical, expert practitioner startups. These are the startups creating the next-generation models, the likes of Stability AI, or Hugging Face with BLOOM, or AI21, and they generally want to build their own models and want the best price performance for their training and inference infrastructure. That's where our investments in silicon, hardware and networking innovations, with Trainium and Inferentia, really play a big role, and we will continue to make them. The second, middle tier is where developers don't want to spend time building their own models; they want the model to be useful on their own data. They don't need their models to write high school homework or various other things. What they generally want is: I have this data from my enterprise, and I want to fine-tune the model to work remarkably well just for this, whether that's text summarization to generate a report, or better Q&A, and so forth. This is where our investments in the middle tier with SageMaker, and our partnerships with Hugging Face and AI21 and Cohere, are all going to be very meaningful. And in the top tier, you'll see us investing in applications: we already talked about CodeWhisperer, which is in open preview, but we are also partnering with a whole lot of top ISVs, and you'll see more on this front to enable the next wave of generative AI apps, because a lot of innovation is yet to be done there. It's day one for us in this space, and we want to enable that huge ecosystem to flourish.

One of the things Dave Vellante and I were talking about in our first podcast, which we just did on Friday — we're going to do it weekly — is the ChatGPT example as a horizontal use case, because everyone loves it. People are using it across all their different verticals, and horizontally scalable cloud plays perfectly into that. So I have to ask you: as you look at what AWS is going to bring to the table, a lot has changed over the past 13 years with AWS. A lot more services are available.
How should someone rebuild, replatform or refactor their application or business with AI on AWS? What are some of the tools you see and recommend? Is it serverless? Is it SageMaker, CodeWhisperer? What do you think is going to shine brightly within the AWS stack, or the service list, as part of this? You mentioned CodeWhisperer and SageMaker. What else should people be looking at as they start tinkering, getting these benefits and scaling up their apps?

Yeah. If I were a startup, I would first work backwards from the customer problem I'm trying to solve, and pick and choose so that I don't deal with undifferentiated heavy lifting. The answer is not going to be one size fits all. On the compute front, if you can architect it completely serverless, I will always recommend that for running your apps, because it takes care of the undifferentiated heavy lifting. On the data end, we provide a whole variety of databases, from relational with Aurora to non-relational with DynamoDB and so forth. And of course we also have a deep analytics stack, where data flows directly from our relational databases into data lakes and data warehouses, and you can get value there along with our partnerships with various analytics providers. The area where I think things are fundamentally changing in terms of what people can do is CodeWhisperer. I was literally trying to write code to send a message through Twilio, and I was about to pull up the documentation. Instead, in my IDE, I just typed a comment like "send a message through Twilio," or "update a Route 53 A record," and it started generating the subroutine (a sketch of what that kind of generated code might look like follows at the end of this transcript). That is going to be a huge time saver for developers, and the goal for us is not just to do it for AWS developers, and not just to generate the code, but to make sure the code is highly secure and follows best practices. So it's not only machine learning; we augment it with automated reasoning as well. Generative AI is going to change not just how people write code, but how software gets built and used. You'll see a lot more coming on this front.

Swami, thank you for your time. I know you're super busy. Thanks for sharing the news and the commentary. Again, I think this is an AWS moment and an industry moment: heavy lifting, accelerated value, agility. AIOps is probably going to be redefined here. Thanks for sharing your commentary, and we'll see you next time; we're looking forward to more follow-up, because this is going to be a big wave. Thanks.

Yeah, thanks again, John. Always a pleasure.

Okay, this is SiliconANGLE's breaking news commentary. I'm John Furrier with SiliconANGLE News, as well as host of theCUBE. Swami, a leader at AWS, has been on theCUBE multiple times. We've been tracking how Amazon's journey has been exploding, over the past five years in particular, and especially the past three. You heard the numbers: great performance, great reviews. This is a watershed moment, I think, for the industry, and it's going to be a lot of fun for the next 10 years. Thanks for watching.
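[Editor's note: to illustrate the comment-to-code flow Swami describes — not actual CodeWhisperer output — here is a minimal sketch of the kind of subroutines an assistant might generate from those two comments. The credentials, phone numbers and hosted zone ID are placeholders; the calls shown are the standard Twilio Python SDK and boto3 Route 53 APIs.]

```python
# A sketch of the kind of code an assistant like CodeWhisperer might
# generate from a one-line comment (illustrative, not actual output).
# Credentials, numbers and zone IDs below are placeholders.
import boto3
from twilio.rest import Client


# send a message through Twilio
def send_sms(account_sid: str, auth_token: str, body: str,
             from_number: str, to_number: str) -> str:
    client = Client(account_sid, auth_token)
    message = client.messages.create(body=body, from_=from_number,
                                     to=to_number)
    return message.sid  # Twilio's ID for the queued message


# update a Route 53 A record
def upsert_a_record(hosted_zone_id: str, record_name: str,
                    ip_address: str) -> None:
    route53 = boto3.client("route53")
    route53.change_resource_record_sets(
        HostedZoneId=hosted_zone_id,
        ChangeBatch={
            "Changes": [{
                "Action": "UPSERT",  # create or update the record in place
                "ResourceRecordSet": {
                    "Name": record_name,
                    "Type": "A",
                    "TTL": 300,
                    "ResourceRecords": [{"Value": ip_address}],
                },
            }]
        },
    )
```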