Welcome everyone to theCUBE's presentation of the AWS Startup Showcase: AI and Machine Learning, top startups building generative AI on AWS. This is season three, episode one of the ongoing series covering the exciting startups from the AWS ecosystem, talking about AI and machine learning. I'm your host, John Furrier. We're joined by two great guests here: Adam Wenchel, who's the CEO of Arthur, and the chief scientist of Arthur, John Dickerson. They'll tell us how they help people build better LLM AI systems and get them to market faster. Gentlemen, thank you for coming on. Yeah, thanks for having us, John. Well, I've got to say I've got to temper my enthusiasm, because the last few months' explosion of interest in LLMs with ChatGPT has opened everybody's eyes to the reality that this is going next gen. This is it. This is the moment. You know, this is the point we're going to look back on and say, this is the time when AI really hit the scene, with real applications. So large language models, also known as LLMs, foundation models, and generative AI are all booming. This is where all the alpha developers are going. This is where everyone's focusing their business model transformations. This is where developers are seeing action. So it's all happening. The wave is here. So I've got to ask you guys, what are you seeing right now? You're in the middle of it. It's hitting you guys head on. You're on the front end of this massive wave. Yeah, John, I don't think you have to temper your enthusiasm at all.
I mean, what we're seeing every single day is everything from existing enterprise customers coming in with new ways of rethinking business processes they've been running for many years, which they can now do in an entirely different way, to all manner of new companies popping up, applying LLMs to everything from generating code and SQL statements to generating health transcripts and legal briefs. Everything you can imagine, right? And when you actually sit down and look at these systems, and the demos we get of them, the hype is definitely justified. It's pretty amazing what they can do. I mean, even just internally, about a month ago, in January, we built an Arthur chatbot so customers could ask technical questions. Rather than reading our product documentation, they could just ask this LLM a particular question and get an answer. And, you know, at the time it was state of the art, but just last week we decided to rebuild it, because the tooling has changed so much. We completely rebuilt it, and it's now way better, built on an entirely different stack. The tooling has undergone a full generation's worth of change in like six weeks, which is crazy. So it just tells you how much energy is going into this and how fast it's evolving right now. John, weigh in as chief scientist. I mean, you must be blown away. It's like, you know, talk about the kid in the candy store. You must be super busy to begin with, but the change, the acceleration: can you scope the kind of change you're seeing, and be specific about the areas where you're seeing movement and highly accelerated change? Yeah, definitely. And it is very, very exciting.
Actually, thinking back to when ChatGPT was announced: that was a night when our company was throwing an event at NeurIPS, which is maybe the biggest machine learning conference out there. And, you know, the hype when that happened was palpable, and it was just shocking to see how well it performed. And then obviously over the last few months, as LLMs have continued to enter the market, we've seen use cases for them, like Adam mentioned, all over the place. Some things I'm excited about in this space are the use of LLMs, and more generally foundation models, to redesign traditional operations research-style problems: logistics problems, auctions, decisioning problems. So moving beyond the already amazing use cases, like creating marketing content, into more core integration with a lot of the bread-and-butter companies and tasks that drive the American economy. And I think we're just starting to see some of that. In the next 12 months, I think we're going to see a lot more. If I had to make other predictions, I think we're going to continue seeing a lot of work on managing inference-time costs via shrinking models or distillation. And I don't know how to time this prediction, but at some point we're going to be seeing lots of these very large-scale models operating on the edge as well. So, you know, the time scales are extremely compressed, like Adam mentioned. Twelve months from now? Hard to say. We were talking on theCUBE prior to this session; we had a CUBE conversation here. And then the Wall Street Journal just picked up on the same theme, which is the printing press moment that created the Enlightenment stage of history. Here we're on a whole other level: automating intellect, efficiency, doing the heavy lifting, the creative class coming back, a whole other level of reality around the corner that's being hyped up.
The question is, is this justified? Is there really a breakthrough here, or is this just another result of continued progress with AI? Can you guys weigh in? Because there are two schools of thought. There's the "oh my God, we're entering a new enlightenment tech phase, the equivalent of the printing press in all areas," and there's "it's just AI, inch by inch." What's your guys' opinion? Yeah, I think, you know, on the one hand, when you're down in the weeds building AI systems all day, every day like we are, it's easy to look at this as incremental progress. Like, we have customers who've been building on foundation models since we started the company four years ago, particularly in computer vision for classification tasks, starting with pre-trained models, things like that. So that part of it doesn't feel really new. But what does feel new is this: when you apply these things to language, with all the breakthroughs in computational efficiency, algorithmic improvements, things like that, and you actually sit down and interact with ChatGPT or one of the other systems out there building on top of LLMs, it really is breathtaking. The level of understanding they have, and how quickly you can accelerate your development efforts and get an actual working system in place that solves a really important real-world problem and makes people way faster, way more efficient. So I do think there's definitely something there. It's more than just incremental improvement. This feels like a real inflection point in the trajectory of AI adoption. John, what's your take on this as people come into the field? I'm seeing a lot of people move from, hey, I've been coding in Python, I've been doing some development, I'm a software engineer, I'm a computer science student, I'm coding in C++, an old-school OG systems person. Where do they come in? Where's the focus? Where's the action? Where are the breakthroughs?
Where are people jumping in, rolling up their sleeves, and getting their hands dirty with this stuff? Yeah, all over the place. And it's funny you mentioned students. In a different life, I wore a university professor hat, so I'm very familiar with the teaching aspects of this. And I will say, toward Adam's point, this really is a leap forward, in that tools like Copilot, for example, which just about everybody's using right now, really do accelerate the way that we develop. When I think about the areas where people are really focusing right now, tooling is certainly one of them, right? Like, you and I were chatting about LangChain before this interview started; two or three people can sit down and create an amazing set of pipes that connect different aspects of the LLM ecosystem. Two, I would say, is engineering. So distributed training might be one, or just understanding better ways to train large models, and then better ways to distill them or run them. So there's this heavy interaction now between engineering and what I might call traditional machine learning from 10 years ago, where you had to know a lot of math, you had to know calculus very well, things like that. Now you also need to be a very strong engineer, which is exciting. You know, I interviewed Swami, the head of Amazon's machine learning and AI, when they had that Hugging Face announcement. I reminded him how Amazon made it easy to get going if you were building a startup back in 2007, 2008, and that language models have had a similar problem: it took a lot of setup and a lot of expense to get provisioned up. Now it's easy. So this is kind of the next wave of innovation. So how do you guys see that from where we are right now? Are we at that point, that moment where it's a cloud-like experience for LLMs and large language models?
Yeah, go ahead, John, do you want to take that? I think the answer is yes, right? We see a number of large companies that are training these and serving these, some of which are being co-interviewed in this episode. I think we're at that point, right? Like, you can hit one of these with a simple single line of Python calling an API. You can boot this up in seconds if you want. It's easy. Got it. So I think that's it. Well, let's take a step back and talk about the company. You guys are being featured here on the showcase. Arthur: what drove you to start the company? How did this all come together? What's the origination story? Obviously you've got big customers. How did it get started? What are you guys doing? How do you make money? Give a quick overview. Yeah, John and I came at this from slightly different angles, but for myself, I've been a part of a number of technology companies. I joined Capital One when they acquired my last company, and shortly after I joined, they asked me to start their AI team. Even though I'd been doing AI for a long time, having started my career back at DARPA, it was the first time I was really working at scale in AI at an organization where there were hundreds of millions of dollars in revenue at stake in the operation of these models, and where they were impacting millions of people's financial livelihoods. And so it got me hyper-focused on these issues around making sure that your AI worked well: that it worked well for your company, and that it worked well for the people who were being affected by it. At the time I was doing this, 2016, 2017, 2018, there just wasn't any tooling out there to support this production model management and monitoring phase of the life cycle. And so I basically left to start the company that I wanted to exist. John has his own story, which I'll let him share. Go ahead, John, you're up. Yeah, so I'm coming at this from a different world.
So I'm on leave now from a tenured role in academia, where I was leading a large lab focusing on the intersection of machine learning and economics. And so questions like fairness, or the response to dynamism in the underlying environment, have been around for quite a long time in that space. I've been thinking very deeply about some of those more R&D-style questions, as well as having deployed some automation code across a couple of different industries, some in online advertising, some in the healthcare space, and so on, where concerns of, again, fairness come to bear. And so Adam and I connected to understand the space of what that might look like in the 2018, 2019 timeframe, from a quantitative and from a human-centered point of view, and we booted things up from there. Yeah, bringing that applied engineering and R&D together with the at-scale Capital One DNA that he had. I can see that fit. I've got to ask you about the next step. As you guys move out and think about LLMs and the recent AI news around generative models and foundation models like ChatGPT, how should we be looking at that news? Everyone watching might be thinking the same thing. I know at the board level, companies are like, we should refactor our business, this is the future, it's that kind of moment. And the tech team's like, okay, boss, why do we do this again? Or are they prepared? How should we be thinking? How should people watching be thinking about LLMs? Yeah, I think they really are transformative. We're seeing companies all over the place, everything from large tech companies to a lot of our large enterprise customers, launching significant projects at core parts of their business. So yeah, if you're serious about becoming an AI-native company, which most leading companies are, then this is a trend you need to be taking seriously. And we're seeing the adoption rate.
It's funny, I would say AI adoption in the broader business world really started, call it four or five years ago, and it was a relatively slow adoption rate. But I think all that investment in climbing the maturity curve has paid off, because the rate at which people are now adopting and deploying systems based on this is tremendous. I mean, this has all happened in just a few months, and we're already seeing people get systems into production. Now, there are a lot of things you have to guarantee in order to put these into production in a way that adds to your business and doesn't cause more headaches than it solves. And that's where we help customers: how do you put these out there in a way that they're going to represent your company well, perform well, and do their job properly? So for the use case, as a customer, as I think about this: there are workflows, they might have had an ML/AI ops team around IT, their inference engines are out there, they probably don't have visibility on, say, how much it costs, they're kicking the tires. When you look at the deployment, there's a cost piece, there's a workflow piece, there's this fairness you mentioned, John. What should I be thinking about if I'm going to be deploying this stuff into production? I've got to think about those things. What's your opinion? Yeah, I'm happy to dive in on that one. So monitoring in general is extremely important once you have one of these LLMs in production, and there have been some changes versus traditional monitoring, which we can dive deeper into, that LLMs have really accelerated. But a lot of the bread-and-butter things you should be looking out for remain just as important as they are for what you might call traditional machine learning models, right?
So the underlying environment, the data streams, the way users interact with these models: these are all changing over time. And so any performance metrics that you care about need to be tracked. Traditional ones like accuracy, if you can define that for an LLM, and ones around, for example, fairness or bias, if that's a concern for your particular use case, and so on. Now, there are some interesting changes that LLMs are bringing along as well, right? Most ML models in production that we see are relatively static, in the sense that they're not getting retrained more than maybe once a day or once a week, or they're just set once and then never changed again. With LLMs, there's this ongoing value alignment, or collection of preferences from users, that is often constantly updating the model. And that opens up all sorts of new vectors for, I won't say attack, but for problems to arise in production, right? Users might learn to use your system in a different way, and thus change the way those preferences are getting collected, and thus change your system in ways you never intended. So maybe the model went through governance already, internally at the company, and now it's totally changed, through no fault of your own. But you need to be watching over that, for sure. Talk about reinforcement learning from human feedback. How's that factoring into LLMs? Is that part of it? Should people be thinking about that? Is that an important component? It certainly is, yeah. So this is one of the big tweaks that happened with InstructGPT, which is sort of the basis model behind ChatGPT and has since gone on to be used all over the place. So value alignment through RLHF, like you mentioned, I think is a very interesting space to get into. And it's one that you need to watch over, right?
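The drift John describes, in data streams and in the performance metrics tracked over them, can be made concrete with one classic drift statistic. Below is a minimal sketch of the Population Stability Index (PSI) applied to a model score or numeric feature; it's an illustration of the monitoring idea only, not Arthur's actual product code, and the sample values are made up.

```python
import math

def psi(expected, actual, bins=10):
    """Population Stability Index between a baseline sample and a
    production sample. Rule of thumb: < 0.1 stable, 0.1-0.25 moderate
    shift, > 0.25 major shift worth alerting on."""
    lo, hi = min(expected), max(expected)
    width = (hi - lo) / bins or 1.0  # guard against a constant baseline
    def frac(sample, i):
        left, right = lo + i * width, lo + (i + 1) * width
        n = sum(left <= x < right or (i == bins - 1 and x == hi) for x in sample)
        return max(n / len(sample), 1e-6)  # floor avoids log(0)
    return sum(
        (frac(actual, i) - frac(expected, i))
        * math.log(frac(actual, i) / frac(expected, i))
        for i in range(bins))

baseline_scores = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8]    # validation-time scores
production_scores = [0.6, 0.7, 0.7, 0.8, 0.8, 0.9, 0.9, 1.0]  # scores drifting upward
drift = psi(baseline_scores, production_scores)
```

An identical distribution yields a PSI near zero, while the upward-shifted production sample here blows past the 0.25 alert threshold, which is the kind of signal a monitoring layer would surface.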
You're asking humans for feedback on outputs from a model, and then you're updating the model with respect to that human feedback. And now you've thrown humans into the loop in a way that is just going to complicate things, right? It certainly helps in many ways. Let's say you're deploying an internal chatbot at an enterprise: you could ask humans to align the LLM behind that chatbot to, say, company values, right? So you're eliciting feedback about those company values, and that's going to scoot the chatbot you're running internally more toward the kind of language you'd like to use internally, on a Slack channel or something like that. Watching over that model in that specific case, right, that's a compliance and HR issue as well. So while it is part of the greater LLM stack, you can also view it as an independent bit to watch over. Got it. And these are important factors. When people see the Bing news, they freak out. It's doing great, then it goes off the rails. It goes big, fails big. So with these models, is that the human interaction, or is that feedback the model isn't accepting? How do people understand how to take that input in and how to build the right apps around LLMs? This is a tough question. Yeah, for sure. So some of the examples you'll see online where these chatbots go off the rails are obviously humans trying to break the system, but some of them clearly aren't, right? And that's because these are large statistical models, and we don't know what's going to pop out of them all the time.
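The feedback loop John outlines, humans picking the better of two outputs and that choice flowing back into the model, can be sketched at its simplest as preference collection. This toy illustrates only the collection and aggregation side; real RLHF then fits a reward model on these pairwise labels and runs policy optimization (e.g. PPO) on top, and the prompts and labels here are hypothetical.

```python
from collections import Counter

feedback_log = []

def record_preference(prompt, output_a, output_b, preferred):
    """preferred is 'a' or 'b': whichever output the human labeler chose."""
    assert preferred in ("a", "b")
    feedback_log.append(
        {"prompt": prompt, "a": output_a, "b": output_b, "preferred": preferred})

def preference_counts():
    """Aggregate view a monitoring layer might watch: if the distribution
    of collected preferences shifts, the model being updated shifts too."""
    return Counter(r["preferred"] for r in feedback_log)

# Hypothetical labels nudging an internal chatbot toward company tone:
record_preference("Greet a customer", "yo.", "Hello! How can I help?", "b")
record_preference("Greet a customer", "Hi there, happy to assist.", "hey", "a")
record_preference("Greet a customer", "Dear Sir or Madam", "Hello!", "b")
```

Tracking this log over time is one way to watch for the governance problem mentioned above: users steering the preference stream, and therefore the model, somewhere it was never intended to go.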
And even if you're doing as much in-house testing as the big companies, the Coheres and the OpenAIs of the world, to try to prevent things like toxicity or racism or other sorts of quote-unquote bad content that might lead to bad PR, you're never going to catch all of the possible holes in the model itself. And so again, it's very, very important to keep watching over it while it's in production. On the business model side, how are you guys doing? What's the approach? How do you engage with customers? Take a minute to explain the customer engagement. What do they need? What do you need? How does that work? Yeah, I can talk a little bit about that. It's really easy to get started: it's literally a matter of just handing out an API key, and people can get going. We also offer versions that can be installed on-prem, because we find a lot of our customers have models that deal with very sensitive data, so you can run it in your own cloud account or use our cloud version. So yeah, it's pretty easy to get started with this stuff. We find people often start using it during the validation phase, because that way they can start baselining the performance of models. They can do champion/challenger, they can baseline the performance of, maybe, different foundation models they're considering. And so it's a really helpful tool for understanding the differences in the way these models perform.
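The champion/challenger baselining Adam mentions comes down to scoring two candidate models on the same evaluation set and only promoting the challenger on a clear lift. The sketch below uses hypothetical stand-in models, a made-up eval set, and a crude substring metric; it illustrates the comparison pattern, not Arthur's API.

```python
def evaluate(model_fn, eval_set):
    """Fraction of prompts whose output passes a task-specific check.
    Real metrics might be exact match, rubric scores, or human ratings."""
    hits = sum(expected in model_fn(prompt) for prompt, expected in eval_set)
    return hits / len(eval_set)

def pick_winner(champion_fn, challenger_fn, eval_set, min_lift=0.02):
    """Keep the incumbent unless the challenger clears a minimum lift."""
    champ = evaluate(champion_fn, eval_set)
    chall = evaluate(challenger_fn, eval_set)
    return "challenger" if chall >= champ + min_lift else "champion"

eval_set = [("capital of France?", "Paris"),
            ("2 + 2?", "4"),
            ("color of the sky?", "blue")]
champion = lambda p: {"capital of France?": "Paris",
                      "2 + 2?": "5",
                      "color of the sky?": "blue"}.get(p, "")
challenger = lambda p: {"capital of France?": "Paris",
                        "2 + 2?": "4",
                        "color of the sky?": "blue"}.get(p, "")
winner = pick_winner(champion, challenger, eval_set)
```

Here the challenger answers all three prompts against the champion's two, so it wins promotion; in production the same comparison would run continuously against live or held-out traffic.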
And then from there, they can flow that into their production inferencing, so that as the systems are out there, you have real-time monitoring for anomalies and all sorts of weird behaviors, as well as that continuous feedback loop that helps you make your product better. And there's observability: you can run all sorts of aggregated reports to really understand what's going on with these models when they're out there making decisions. I should also add that, just today, we have another way to adopt Arthur: we are now in the AWS Marketplace. So we're available there, just to make it that much easier to use your cloud credits, skip the procurement process, and get up and running really quickly. And that's great, because Amazon's got SageMaker, which handles a lot of the privacy stuff and all kinds of cool things, or you can get down and dirty. So I've got to ask about the next one: production is a big deal, getting stuff into production. What have you guys learned that you can share with folks watching? Is there a cost issue I've got to monitor? I see you brought that up. You can talk about the human reinforcement issues. All these things are happening. What are the big learnings you can share for people who are going to put these into production, to watch out for, to plan for, to be prepared for? Hope for the best, plan for the worst. What's your advice? I can give a couple of opinions there, and I'm sure Adam has some too. The big one from my side, and I mentioned this earlier, is the input data streams, because humans are still exploring how they can use these systems to begin with. It's really, really hard to predict the type of inputs you're going to be seeing in production, right? We always talk about chatbots, but the same goes for any sort of generative text task like this, right?
Let's say you're taking in news articles and summarizing them, or something like that. It's very hard to get a good sampling, even of the set of news articles, in such a way that you can really predict what's going to pop out of that model. So to me, adversarial maybe isn't the word I would use, but it's an unnatural, shifting input distribution of prompts that you might see for these models. That's certainly one. And the second one I would talk about is that it can be hard to understand the inference-time costs behind these LLMs. The pricing on these is always changing as the models change size; it might go up, it might go down, based on model size, based on energy cost, and so on. But you're pricing per token, or per 1,000 tokens, and that, I think, can be difficult for some clients to wrap their heads around. Again, you don't know a priori how these systems are going to be used, so it can be tough. And so again, that's another metric that really should be tracked. Yeah, and there are a lot of trade-off choices in there, like how many tokens you want at each step in the sequence, and based on how you sample, you can reject tokens. So depending on how your system is operating, that can make the cost highly variable. And that's if you're using an API version where you're paying per token. A lot of people also choose to run these internally, and as John mentioned, the inference time on these is significantly higher than a traditional, classical model, even an NLP classification model or a tabular data model: orders of magnitude higher. So you really need to understand, as you're constantly iterating on these models and putting out new versions and new features, how that's affecting the overall scale of that inference cost, because you can use a lot of computing power very quickly with these models. Yeah: scale, performance, price, they all come together.
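The per-token pricing John and Adam describe lends itself to a back-of-the-envelope cost model. The sketch below captures the usual shape, separate rates per 1,000 input and output tokens; the specific prices and volumes are placeholders, not any vendor's actual rate card.

```python
def estimate_monthly_cost(requests_per_day, avg_prompt_tokens, avg_output_tokens,
                          price_in_per_1k, price_out_per_1k, days=30):
    """Hosted LLMs typically bill per 1,000 input and output tokens,
    usually at different rates, so cost scales with usage patterns
    you can't fully predict up front."""
    per_request = (avg_prompt_tokens / 1000 * price_in_per_1k
                   + avg_output_tokens / 1000 * price_out_per_1k)
    return requests_per_day * per_request * days

# 10,000 requests/day, 500-token prompts, 200-token completions,
# at a hypothetical $0.002 / $0.004 per 1k tokens:
monthly = estimate_monthly_cost(10_000, 500, 200, 0.002, 0.004)
```

Because average prompt and completion lengths drift with how users actually exercise the system, these inputs are themselves metrics worth tracking, which is exactly the point made above.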
I've got to ask, while we're here, about the secret sauce of the company. If you had to describe it to the people out there watching, what's the secret sauce of the company? What's the key to your success? Yeah, so John leads our research team, and they've done a number of really cool things. I think AI, as much as it's been hyped for a while, commercial AI at least, is really in its infancy. And so the way we're able to pioneer new ways to think about performance for computer vision, NLP, and so on is probably the thing I'm proudest of. John and his team publish papers all the time at NeurIPS and other places, but I think it's really about being able to define what performance means for basically any kind of model type, and give people really powerful tools to understand that on an ongoing basis. John, the secret sauce: how would you describe it? You've got all the action happening around you. Yeah, I mean, I appreciate you talking me up like that. No, seriously, props to you. I would also say a couple of other things. We have a very strong engineering team, and I think some early hires there really set the standard at a very high bar that we've maintained as we've grown. And I think that's really paid dividends as scalability has become even more of a challenge in these spaces, right? And that's not just scalability when it comes to LLMs; that's scalability when it comes to, you know, millions of inferences per day in traditional ML models, that kind of thing as well. Compared to potential competitors, that's made us able to operate more efficiently and pass that along to the client. Yeah, and I think the infancy comment is really important, because it's the beginning. There really is a long journey ahead, a lot of change coming. I guess it's a huge wave, so I'm sure you guys have a lot of planning at the foundation, even for your own company. So I appreciate the candid response there.
Final question for you guys: what should the top things be for a company in 2023? If I'm going to set the agenda as a customer moving forward, putting the pedal to the metal, so to speak, what are the top things I should be prioritizing, or that I need to do, to be successful with AI in 2023? Yeah, I think, number one, as we've been talking about this entire episode, things are changing so quickly, and the opportunities for business transformation, for really disrupting different applications and different use cases, are, you know, I don't think we've even fully comprehended how big they are. So really digging into your business and understanding where you can apply these new sets of foundation models: that's a top priority. The interesting thing is, there's another force at play, which is the macroeconomic conditions, and a lot of places are having to work harder to justify budgets. In the past, a couple of years ago, maybe they had a blank check to spend on AI and AI development at a lot of large enterprises, limited primarily by the amount of talent they could scoop up. Nowadays, these expenditures are getting scrutinized more. And so one of the things we really help our customers with is calculating the ROI on these things. So, you know, if your model's out there performing, and you have a new version you can put out that lifts performance by 3%, how many tens of millions of dollars does that mean in business benefit? Or if I want to get approval from the CFO to spend a few million dollars on a new project, how can I bake in, from the beginning, the tools to really show the ROI along the way?
Because, you know, I think when these systems are done well, for a software project, the ROI can be pretty spectacular. We see over 100% ROI in the first year on some of these projects. And so I think in 2023, you just need to be able to show what you're getting for that spend. It's a needle-moving moment. You see it all the time; some of these aha moments are like, whoa, blown away. John, I want to get your thoughts on this, because one of the things that comes up a lot with companies I talk to that are in the second wave, I would say, or maybe the front wave of adopters, is talent and team building. You mentioned some of the hires you got were game-changing for you guys and set the bar high. As you move the needle, new developers are going to need to come in. What's your advice, given that you've been a professor and you've seen students? I know a lot of computer science people want to shift. They might not yet be skilled in AI, but they are proficient in programming. Is that going to be another opportunity, with open source and everything that's happening? How do you talk to that next level of talent that wants to come into this market, to supplement teams, be on teams, lead teams? Any advice for people who want to build their teams, and for people who are out there and want to be a coder in AI? Yeah, I have advice. And this actually works for what it would take to be a successful AI company in 2023 as well, which is: just don't be afraid to iterate really quickly with these tools, right? The space of what they can be used for is still being explored. A lot of the tasks they're used for now are not new tasks, right? Creating marketing content using machine learning is not a new thing to do. It just works really well now.
And so I'm excited to see what the next year brings in terms of folks from outside of core computer science, other engineers, or physicists, or chemists, or whatever, who are learning how to use these increasingly easy-to-use tools to leverage LLMs for tasks that I think none of us have really thought about before. So that's really, really exciting. And toward that, I would say: iterate quickly, right? Build things on your own, build demos, show them to friends, host them online. You'll learn along the way, you'll have something to show for it, and, you know, you'll help us all explore that space. Guys, congratulations with Arthur. Great company, great picks-and-shovels opportunities out there for everybody. Iterate fast, get in quickly, and don't be afraid to iterate: great advice. And thank you for coming on and being part of the AWS Startup Showcase. Thanks. Yeah, thanks for having us on, John, always a pleasure. Yeah, great stuff. Adam Wenchel and John Dickerson with Arthur, thanks for coming on theCUBE. I'm John Furrier, your host. Generative AI on AWS: keep it right there for more action with theCUBE. Thanks for watching.