All right, thanks everyone for joining us today for CNCF's live webinar, LLMs and the Enterprise: Promise and Practice. I'm Libby Schultz and I'll be moderating today's webinar. I'm gonna read our code of conduct and then hand over to Michael Tannenbaum, co-founder at Mycelial, and David Gleason, Senior Advisor and Chief Data and Analytics Officer at All In on Data. A few housekeeping items before we get started. During the webinar, you are not able to talk as an attendee. There's a chat box on the side of your screen. Please feel free to drop questions in there. Tell us hello, where you're calling from, and we'll get to as many of your questions as we can either throughout or at the end. This is an official webinar of CNCF and as such is subject to the CNCF code of conduct. Please do not add anything to the chat or questions that would be in violation of that code of conduct, and please be respectful of all of your fellow participants and our presenters. Please also note that this recording and slides will be posted later today to the CNCF online programs page at community.cncf.io under online programs. They're also available via the registration link you used to sign in today, and the recording will also stay on our online programs YouTube playlist; I will drop that link in the chat in just a minute. With that, I will hand things over to Michael and David to take it away for today's presentation.

Fantastic, thank you very much Libby. I'm David Gleason and thank you all for joining our conversation today. You'll hear shortly from my co-presenter Michael, but I'm gonna kick things off, so let's jump into it. So we're gonna talk today about large language models. Obviously you have all heard a lot about large language models over the last several months, and we wanna do a brief exploration of what's going on in the enterprise. How are these things being applied and how can they be applied?
As well as what are some of the practical considerations and the realities of trying to work with LLMs. And then finally, to make it all real, Michael's gonna present a live demonstration. So everyone keep your fingers crossed; you know what they say about live demonstrations. But Michael will present a live demonstration of a really interesting use case about large language models running on small devices in an edge framework. So this should be very interesting. We wanna start off by talking, though, about the promise of large language models. One of the things that I get asked all the time is, is all this conversation about LLMs that we see just hype? And that's actually a really hard question to answer, because it may well be hype. There is so much discussion and so much being written about large language models and generative AI overall that it's inevitable that there will be some degree of hype. However, I think that we're sort of missing the point if we focus on whether this hype is merited, because the fact is that an ever-growing number of companies are actually using generative AI and large language models today. McKinsey did a study that was released in the summer of 2023 that said that of the organizations they surveyed earlier this year, one third of them indicated that at least one department within the organization was regularly using gen AI to support business functions. And that number is only going to be growing. So we are in the midst of this explosion in interest in generative AI and large language models. And at the same time, we're seeing so many rapid advances in the capabilities of those models that it is hard to separate the hype from the reality. I think part of that hype and part of that excitement is the fact that this is the first form of AI that most people have ever been exposed to that they can directly interact with and control.
Most AI and machine learning algorithms in the past have been things that sat in models and workbenches and technology tools that were available to people who knew how to use them. But with ChatGPT and its brethren, anybody with a web browser suddenly has the ability to interact with and use large language models and artificial intelligence. And so that, as we'll talk about, is both a blessing and a curse. The other thing that's very interesting here is that there has been a lot of research done over the last several months about the potential benefits of this in quantitative terms and how it impacts the bottom lines of companies. And while the estimates for the financial impact and the overall impact on the economy of large language models vary widely, I've seen estimates as high as $4.5 trillion this year in bottom line impact to companies worldwide. What is emerging as a consensus is that we're in a little bit of a winner-take-all sort of race, which is to say that the companies that figure out how to successfully deploy AI in a way that impacts their bottom line will reap outsize rewards compared to the companies that are later to the game. So we're witnessing a little bit of a race to figure out how to deploy large language models in a corporate environment and to do it effectively. And of course, we're seeing that there are new use cases coming out every day, so it's virtually impossible to sort of get your head around all the different ways that this could be used. But we're gonna talk about a few examples or categories of what some of those use cases are. I think the other interesting thing that you'll see is the landscape is changing every day. So there's a picture on the screen from May 2023, just a few months ago, and I'm pretty sure it's out of date already. What you will be hearing about, though, are all sorts of different large language models. There are a lot of commercial models out there.
These are the ones you've probably all heard of, things like ChatGPT and variations of it like GPT-4. Google has created PaLM and Bard and BERT. The company Anthropic has created Claude and Claude 2, and there are many, many more. And the landscape is shifting all the time because new variations of these models are being created. There are constant changes in the investment in these models. So Microsoft has previously announced an investment of $13 billion in OpenAI, and they're using OpenAI's models, including ChatGPT, to power many of the offerings that they're releasing to the market. Anthropic announced earlier this week an investment of $2 billion from Google. That comes on the heels of a $4 billion investment from Amazon earlier this fall. So there's a lot of investment, a lot of new developments happening. We're seeing a lot of action in the open source model space. So these are models that in some cases were initially created by private organizations and then have been put into the open source domain, and some of them actually are created natively as open source, coming out of research projects at academic institutions and other foundations. And then I think the other thing that often is missing from the conversation is not only are we talking about the large language models that you can directly interact with, but there also are a tremendous number of software products and tools that have LLMs embedded in them. So things like Microsoft Copilot and GitHub Copilot and Amazon's CodeWhisperer, and even Google has gotten into the game more recently with something called Duet AI. And these are tools that sit inside applications that we use every day and provide additional AI-driven functionality in those applications. So the landscape is certainly huge and it is changing very, very rapidly. So with all this money flowing into the space, what is it that people are doing with this? Well, it falls into a couple of different categories.
The first of them is, and I think the one that you tend to see most often, what I'll call pure productivity applications. So these are things where you're using an out-of-the-box large language model like ChatGPT to do things that are productivity focused. So you might be doing things like creating a first draft of an article or a paper that you're gonna write, or doing basic research with something like ChatGPT. This is sort of the model that you'd almost call the "intern GPT" model, because it's not going to create a finished product, but it's probably gonna give you something analogous to what you would get if you were asking an intern to go and do some research for you. And so effectively, this is sort of a service that does both web search and summarization of web content based on natural language inputs, on questions that you ask, and it structures its output as English language. Of course, the caveat to this is that most of these models have a training cutoff date, and so they're only aware of what was available in their training data set, which in many cases is virtually the entire internet as of that date. So if you were to look at GPT-3.5, I believe its training cutoff is something like September 2021, although don't quote me on that date. But the point is you've gotta understand, for each of these models, how recent the information that they are aware of is. We're also seeing a lot of things like automatic meeting annotation, document summarization, email summarization, as well as email composition. And we're seeing these things not just used directly through the interface of a ChatGPT or a Bard, but also increasingly embedded in other tools that we use: in email tools, in productivity tools like Notion, for example, et cetera. And so these are really a whole suite of use cases looking at how we make ourselves as workers more efficient and save time.
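To make the document-summarization use case concrete, here is a minimal sketch of the common "chunk, summarize each piece, then summarize the summaries" pattern that these productivity tools typically use under the hood. This is an assumption-laden illustration, not anything shown in the webinar: the `summarize` function is a stub standing in for a real LLM call (e.g., to ChatGPT's API), and it just keeps the first sentence so the scaffold runs on its own.

```python
# Sketch of map-reduce summarization for documents longer than an
# LLM's context window. `summarize` is a placeholder for a real
# model call; here it keeps only the first sentence of its input.

def summarize(text: str) -> str:
    # Stand-in for an LLM call such as ChatGPT's API.
    return text.split(". ")[0].strip() + "."

def chunk(text: str, max_chars: int = 500) -> list[str]:
    # Split a long document into roughly max_chars-sized pieces
    # so each piece would fit in the model's context window.
    words, pieces, current = text.split(), [], ""
    for w in words:
        if len(current) + len(w) + 1 > max_chars:
            pieces.append(current)
            current = w
        else:
            current = (current + " " + w).strip()
    if current:
        pieces.append(current)
    return pieces

def summarize_document(text: str) -> str:
    # Map: summarize each chunk. Reduce: summarize the joined summaries.
    partial = [summarize(c) for c in chunk(text)]
    return summarize(" ".join(partial))
```

With a real model plugged into `summarize`, this is essentially the "intern" workflow: rough condensation in the middle, a human review pass at the end.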
And I'd like to think that a lot of this is around amplifying human productivity, rather than simply replacing people with machines. But this is sort of, I would say, where most organizations have started in looking at generative AI. As we get a little bit further up the learning curve, organizations are also looking at how they can provide better customer experiences with gen AI. And so this is something we're seeing a lot of use cases around, things like customer self-service. So imagine the type of questions that a customer might normally ask a call service representative or a systems engineer about. They can ask that question in their native language to a chatbot that's powered by a large language model and have that chatbot give them context-sensitive responses. So this is certainly a level above what you would get if you were just looking through an online knowledge base, because you're getting answers that are tailored to the questions you're asking. And in some cases, you could argue that sometimes you're getting more timely and more precise answers than what you might get through a call center response if you're talking to a human being. Certainly it, in some cases, can improve the customer experience by doing that. And it frees up the call center people to respond to more edge cases, the ones where you have to have a human in the loop and apply human judgment. One of the other interesting examples: imagine being able to have a large language model that could have a conversation with a customer about something very complex, like a large commercial insurance policy. Rather than that customer having to spend hours looking through what might be a several-hundred-page policy document, they could simply ask questions; they could identify themselves and ask the chatbot, is this type of loss covered by my policy? And if so, what are the limitations?
And in theory, that large language model could analyze the policy and provide English language responses. So we're seeing an increasing growth in how LLMs can be used to provide better customer service. And a lot of that, again, is not so much about replacing the humans as about creating an alternative route: providing customers with another way to interact with the company and another way to get information, giving them the choice to interact in a way that's most convenient for them. I think if we then sort of climb up the ladder in terms of capability, the next thing would be a set of domain-specific applications. So these are applications that use an LLM, but where that model has been trained or tuned specifically on the business domain or technical domain that the model is dealing with. So this is moving way beyond the concept of an out-of-the-box model, and we'll talk in a few minutes about some of the ways that you can get there. But these tend to be very specialized use cases like contract generation. There are legal departments and industries around the world that are looking at ways that they can augment the work that lower-level, entry-level attorneys often have to do, which is reformatting contracts, summarizing contracts and extracting the key points from a contract so that they can consult with senior attorneys. These are things that can be done fairly effectively in some cases by trained large language models. Or another example: imagine if you had a large language model that could comb through thousands of pages of documentation about proposed regulatory changes affecting your industry and could highlight what those changes are, or summarize them in an easily digestible way, and could even compare those regulatory changes to your current policy documents so that it could highlight for you what the potential impact of a regulatory change would be on your corporate policies.
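The insurance-policy chatbot just described is usually built with what David later calls context injection (often termed retrieval-augmented generation): passages relevant to the customer's question are pulled out of the policy document and placed into the prompt. Here is a minimal, hedged sketch; the word-overlap retriever and the prompt template are both simplifying assumptions for illustration, and real systems would typically use embedding-based search and a tuned template.

```python
# Sketch of context injection for document Q&A: find the policy
# passages most relevant to a question, then build the prompt an
# LLM would answer. A naive word-overlap score stands in for a
# production retriever (usually embedding similarity search).

def score(question: str, passage: str) -> int:
    # Count how many question words appear in the passage.
    q = set(question.lower().split())
    return sum(1 for w in passage.lower().split() if w in q)

def retrieve(question: str, passages: list[str], k: int = 2) -> list[str]:
    # Return the k passages with the highest overlap score.
    return sorted(passages, key=lambda p: score(question, p), reverse=True)[:k]

def build_prompt(question: str, passages: list[str]) -> str:
    # Hypothetical prompt template; the "only the excerpts" instruction
    # is one common way to reduce hallucinated coverage claims.
    context = "\n".join(retrieve(question, passages))
    return (
        "Answer using only the policy excerpts below.\n"
        f"Excerpts:\n{context}\n"
        f"Question: {question}\nAnswer:"
    )
```

The resulting string would then be sent to whichever model powers the chatbot; the model never sees the full several-hundred-page policy, only the retrieved excerpts plus the question.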
So there's a lot of these very interesting domain-specific use cases that we see evolving over time. And I think this will probably be one of the big growth areas for large language models. One of the concepts we'll talk about in a couple of minutes is the idea that to the extent that you're using an off-the-shelf model, you're not really creating a sustainable competitive advantage because anybody else could use that model in the same way and could get a similar set of results to what you're getting. So as you go from productivity enhancements that use off-the-shelf models to more customer-facing and then domain-specific applications, you'll need to have a way of training models to be aware of your business context, your past business performance and documents and knowledge and factor that in so that you're creating a large language model capability that really reflects the knowledge and the intellectual property of your organization. And so that will become one of the big challenges that we will talk about. And of course, everything we've talked about so far has sort of been around the human interaction with large language models. And that certainly is where we see most of the action today. But increasingly, I think that we will see large language models powering B2B interactions. So imagine, for example, a contract negotiation in which one or even both sides of the contract negotiation is initially handled by large language models because you could have a large language model that has been trained on what is acceptable legal language and contract terms for your company. And it could be ingesting and interpreting a proposed contract and responding with proposed changes based on your preferred terms. Now, obviously, you can imagine this going a little bit too far where you could have two machines arguing back and forth forever. 
So clearly, this is an example where we're not talking about turning a process entirely over to machines, but we're talking about letting large language models augment what the humans are doing, while keeping a human in the loop at all times to really moderate the process and decide when the large language models have done as much as they can do and it's time for the humans to step in. But I think we're going to see more and more of this B2B interaction where one or both sides of that interaction may at times be serviced or supported by a large language model. One of the other ones that is interesting: for any of you who have ever had to respond to an RFP, a request for proposal, you know that those often can come out with dozens or even hundreds of pages of detailed requirements and specifications and qualifying questions and lots of other data. And it is very painful and very time consuming sometimes to read through those. But if I had a large language model that was trained on the last 100 RFP responses my company had prepared, it could probably do a very credible job of reading the new RFP, combing through it, extracting the questions and then generating proposed answers from my historical corpus of RFP responses. So there are a lot of cases where we could see this really happening to support B2B interactions in a way that doesn't remove humans entirely from the loop but puts them in a role of supervising and then stepping in when the time is right for them to do so. And then finally, one of the biggest use cases, and I think one of the most exciting, is code generation. This has got a lot of people excited, and it's also got a lot of people somewhat concerned and maybe scared about what this really means in practice. So one of the interesting things that we'll talk about in a couple of slides about large language models is that the quality of the output directly corresponds to the quality of the data on which they are trained.
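The RFP scenario above is essentially a two-step pipeline: extract the questions from the incoming document, then look up the closest match in the historical corpus of answered RFPs. A runnable sketch of that pipeline follows; the word-overlap matcher is a stand-in assumption for what would really be an LLM or embedding search, and the corpus entries are made-up examples.

```python
# Sketch of the RFP-response pipeline: pull question sentences out
# of an RFP, then draft answers by matching each question against a
# corpus of previously answered questions. Word overlap stands in
# for an LLM or embedding-based matcher.
import re

def extract_questions(rfp_text: str) -> list[str]:
    # Treat any sentence ending in "?" as a question.
    return [q.strip() + "?" for q in re.findall(r"([^.?!]+)\?", rfp_text)]

def propose_answer(question: str, corpus: dict[str, str]) -> str:
    # Pick the historical question with the most words in common,
    # then reuse its stored answer as the draft response.
    q = set(question.lower().split())
    best = max(corpus, key=lambda past: len(q & set(past.lower().split())))
    return corpus[best]

def draft_responses(rfp_text: str, corpus: dict[str, str]) -> dict[str, str]:
    # Map every extracted question to a proposed draft answer.
    return {q: propose_answer(q, corpus) for q in extract_questions(rfp_text)}
```

In the supervising role David describes, a human would review each drafted answer before anything leaves the building; the pipeline only produces candidates.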
And so if you look at large language models that were trained by crawling the internet, well, I think we all realize that the quality level and the accuracy level of data that you find on the internet is, we'll call it, very uneven. There is information that is high quality and there's information that is of very suspect quality, and there's even a fairly large amount of information that is intentionally incorrect that can be found on the internet in general. However, at least so far, if you look at the amount of actual software code that's available on the internet, most of that code, especially in moderated open-source projects and repositories, for example, actually works as intended and is fairly high quality. So large language models have been able to, arguably, understand code even better than they've understood the English language, and they can generate some very good, nearly ready-to-run code. And that's largely thanks to the consistency and the overall high quality of the source material on which they were trained. So what this means for a lot of organizations is, again, not that we're going to be replacing software engineers with large language models, but software engineers will be using LLMs as one of their tools. They'll be able to use this to help them, for example, translate an application from an old language to a new target language that they want to replatform the application in, or they'll be using it to create some lower-value pieces, less critical parts of an application. It also means that we'll see less-technical employees developing applications with the help of a large language model and with less involvement from senior engineers.
Now, obviously the level of complexity and the level of risk involved in the application will still dictate how much you have senior engineers involved, but there will be opportunities for business people to generate code with large language models that is nearly ready to run, which is, when you think about it, not too dissimilar to the low code, no code movements that we have today. However, it opens up new avenues of generating code that can do things that a canned low code or no code product probably can't do. But at the end of the day, we'll still have human software engineers who hold the ultimate pen on writing code, and they will be largely deployed on the high criticality projects, using LLMs as a tool to help them, but certainly, we hope, not being replaced by large language models. There's one more pattern I think is emerging, and we're seeing early indications of this, and this is the concept of chaining large language models together. So one of the truisms of large language models is that no single large language model is good at everything. Some large language models aren't good at much at all and some are good at several things, but they all have strengths and weaknesses. So one of the things that people are experimenting with now is what if you were to combine some of those together in a chain, and this can work in a couple of different ways. One of the ones that we've already seen launched is direct linear chaining of large language models. So for example, there's a product that Google's Jigsaw incubator team created that is called Perspective API, and they created this product to moderate human responses on online forums, to weed out offensive language and inappropriate language and things like that, but to do it in a much more sophisticated way than simply a keyword filter would, for example.
What they have now done is they've actually put this as a moderating layer on top of other large language models so that it can help ensure that a large language model does not violate its own guidelines and policies and inadvertently create offensive or inappropriate information. So this is actually being used. The Perspective API model has been used by Google, Meta, OpenAI, Anthropic and others, where the output of their large language model is then fed through Perspective, which moderates the content; if it passes the inspection of the Perspective moderating layer, it's then passed on to the user, otherwise it's suppressed and the user doesn't see that answer. So this is really interesting, and there's experimentation going on right now with actually using multiple large language models almost in a competitive form, where different models are given the same prompt or the same query and then a model sitting on top of them gathers up the responses, compares them and picks the one that it feels is the best response. So we're gonna see a lot more work around how these different models can be combined to add value in new ways. So all of this sounds really good, right? I mean, who wouldn't love to have large language models that can help you with every aspect of your job and make us all more productive and make our companies more money? I mean, what could go wrong? Well, here's where the reality check comes in, and we don't have time, obviously, to talk about all the downsides of large language models, but let's talk about a couple of key things. Number one, they're not infallible. No matter how convincing they are and how well articulated their responses, we have to be aware that large language models can hallucinate, that is, they can say things that simply aren't true; they can give us biased answers; and at the root, there is no concept of truth or accuracy in a large language model.
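The moderating-layer chaining pattern described a moment ago fits in a few lines of code: the generator's output only reaches the user if a second model approves it. In this sketch both model calls are stubs (a trivial echo generator and a blocklist check standing in for a real LLM and a classifier like Perspective); the function names and the blocklist are assumptions of the illustration, not anything from the talk.

```python
# Sketch of chaining a moderation model on top of a generator.
# `generate` and `is_acceptable` are stubs standing in for real
# model calls (an LLM and a toxicity classifier, respectively).

BLOCKLIST = {"offensive", "slur"}  # toy stand-in for a trained classifier

def generate(prompt: str) -> str:
    # Placeholder for the underlying LLM call.
    return f"Echoing: {prompt}"

def is_acceptable(text: str) -> bool:
    # Placeholder for the moderating model's judgment.
    return not any(word in text.lower() for word in BLOCKLIST)

def moderated_reply(prompt: str, refusal: str = "[response withheld]") -> str:
    # Chain: generator output passes through the moderator;
    # suppressed responses never reach the user.
    candidate = generate(prompt)
    return candidate if is_acceptable(candidate) else refusal
```

The competitive variant mentioned above is the same shape with fan-out: call several generators, then let a judging model pick among the candidates instead of merely approving or suppressing one.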
I think one of the people who said it very well was Cassie Kozyrkov, and I probably mispronounced her name, but she was until very recently the chief decision scientist for Google, and she said, AI is not a person, it's just a pattern-finding thing labeler, which is a pretty interesting way of putting it. It really just builds strings of language out of patterns it has learned from previous strings of language it has looked at, and it has no concept and no mechanism of ensuring those strings of language are anything other than well-formed language. So there's no guarantee that they're going to be correct, and I won't spend a lot of time citing examples, but if you look up large language model hallucinations you can see scores and scores of well-publicized examples where large language models were confidently incorrect. Their output was phrased and articulated very clearly; it just wasn't correct. So that's one of the challenges. The other part of the challenge is that large language models were trained, as I mentioned earlier, on publicly available documents, and we have to pin an asterisk to that phrase publicly available, because there are legal challenges to how some of those documents were used, maybe in ways that violate the intellectual property ownership rights of the authors of those documents. But a large language model is based on massive volumes of data; it doesn't know anything about your business specifically, except to the extent that your business, your public documentation, might have been ingested during the training phase of that large language model. It certainly is not aware of your company specifically, and as I mentioned earlier, therefore you've got no inherent advantage over other people who are using the same large language model as you. And that's something to be very conscious of, because simply using an LLM isn't going to create lasting business advantage, and it also creates a whole set of risks, and those risks have not really been spoken about
as much as maybe they should. What we're seeing in the last couple of months, though, is real growth in the concept of responsible AI and really starting to try to understand and manage some of those risks, and those risks we can talk about in a moment. But how do you get value out of this? Given all those potential limitations, what do you do to actually get value? Well, a couple of different things. I think one of the most basic things is, first of all, know how to identify and qualify a large language model use case. And I think it's very important to remember that while LLMs are getting a lot of attention and are dominating the conversation, they are far from the only useful form of artificial intelligence. In what must be one of the most awkward phrases of 2023, we have to remember to look at traditional AI in addition to generative AI. And so yes, it is sort of weird to call artificial intelligence traditional, but the point is that LLMs are just one form of artificial intelligence, and what we see is, for most companies, the number of non-generative AI use cases, in other words use cases that aren't going to use a large language model, is far greater than the number of use cases where a large language model is appropriate. So it's very critical to know how to identify what is a valid use case for a large language model. And part of doing that is also understanding which model you're going to use. As you saw earlier, there are literally hundreds if not thousands of different large language models available in the private and public domains, and one of the things that we really need to understand is how to match the model to the problem. One of the idiosyncrasies of these models is that people tend to train them and test them against a number of different benchmarks, and there is a constantly increasing set of public benchmarks that you can benchmark a large language model against. And of course, like all of these things, what we tend to see is that as performance goes up against a
particular benchmark for a model, that model also suffers in performance on other benchmarks. So you end up with a situation where you've got some models optimized for some benchmarks and other models optimized for other benchmarks, which makes it pretty challenging. So understanding what the benchmarks are and knowing how to identify a model that will work with your use case is a big part of the battle. And then finally, I've hammered home a number of times the point that a generic model isn't gonna give you a lasting advantage, so you need to take steps to control, modify and customize the way that that model behaves. And there's a whole other conversation about how these are accomplished, but look at things like prompt engineering, which is a fancy term for really knowing how to ask the types of questions that you wanna ask of the model and make the model give you the kind of answer you want. Through fine tuning of a model, which is presenting additional information to the model that will change the manner in which it responds, you can change the types of language it uses; you can change the way it expresses itself. There are additional techniques like context injection, which is where you are adding additional documents that have been analyzed to the underlying foundation model, so you're basically expanding the knowledge base, if you wanna call it that, of the model. And then finally, in some cases, custom model development, which is completely training a new large language model on your own data. Now, as we go down that list, these get more and more effective, but they also get considerably more time consuming and expensive. So custom model development is not something that's accessible for most use cases, but there will be situations where it's appropriate. And then last but not least, and probably the biggest and most fundamental thing that you can do, is pay attention to the data that you're
using with a model. So whether you're fine tuning a model, doing context injection or doing custom development, the data that you are using to augment your model absolutely matters. The model's only as good as the data that it's trained on, and this is the hard reality of this: putting data into a model is really complicated. So we have problems with data quality, which is the accuracy and consistency of the data that you want to introduce to the model, as well as data availability. On the availability problem, I had a conversation with someone not long ago, and the challenge was: okay, we want to identify good contracts and we want to train a model on these good legal contracts that we've done. We know we have over 10,000 good contracts; how in the world are we going to find them and get them? Some of them are in SharePoint, some are in an email, some are on file shares, some are in a contract management system. Some are really good, some are kind of good, and some we wouldn't want to replicate. And so just the availability and accessibility of this data becomes a major expense and stumbling block. And one of the paradoxes of this is, if you look at data management as it's been practiced over the last decade in most organizations, the focus has been on databases and structured data: making sure our transaction data is right, our customer master data is correct, et cetera. We have traditionally, in most industries, put less emphasis on the quality, the structure and the availability of this unstructured data: documents that are spread across document management systems and SharePoints and emails, textual data, audio data, video data, et cetera. This tends to be harder for us to get our hands on and to know where it is. So one of the biggest challenges and expenses in creating real lasting value out of large language models is going to be learning to apply data management principles to this whole set of unstructured data. And I think that the really important thing to remember
is if you're using someone else's model, you really cannot know what it was trained on. Even the people who developed the model do not know, at a detailed level, what data that model is trained on, because there's simply so much of that data. So training a model or customizing a model using your own data is really something that needs to be taken into consideration. Now, I referenced earlier a bunch of risks. Again, I won't go through these in great detail, but there are a huge number of evolving risks; I'll call out a couple. The economic sustainability of large language models is something that's getting a lot of attention right now, because even at what can be some relatively high costs for access to models, it isn't really going to be economically sustainable. Companies like OpenAI and Anthropic and others are arguably losing money on every API call and are having to constantly get injections of cash in order to develop new models and enhance their models. And so at some point these companies are gonna have to come up with a business model that is sustainable, and that may indeed mean that the compute charges that we pay for access to these models will have to go up. I obliquely referenced earlier that there are a lot of challenges being raised around data privacy and copyrights and what that means for your ability to train a model on data that is available to the public, and certainly security concerns. Something we're seeing a lot of is responsible AI: bias, fairness, transparency and the ethical application of artificial intelligence. And in fact we had the executive order in the United States yesterday, issued by the president, that asks a number of different federal government agencies to take steps to increase responsible, ethical AI usage in the United States. We're dealing with a very uncertain regulatory landscape: we have, again, the fallout from yesterday's AI executive order, we have the pending EU AI Act legislation and any number of other regulatory and legal proposals that
are working their way through various approval mechanisms, which makes it very hard for organizations to truly predict what their obligations will be and what the future risk of using AI will be. So these risks are not insubstantial and are definitely something we have to take into account. The Pew Research Center did a study a couple of months ago, and they found that 70% of the respondents who had heard the term AI had, quote, little to no trust in companies to make responsible decisions about how they use it in their products. Four out of five of them said AI will probably lead to their own personal information being used in ways that they don't intend. So there's a high degree of skepticism, and the potential for reputational harm, in using AI. The very last thing I'll talk about before I turn it over to Michael to go through an interesting demo is that we also have to pay attention to how AI impacts people in the workforce. The goal we would all like to see is using artificial intelligence to amplify human potential rather than replacing people. But in order to do that, we are going to have to manage what will be a profound impact on the workplace: on the skills that people have, the types of roles that people play, and how we train and equip them to interact with, create, or moderate artificial intelligence solutions. So I hope that was a reasonable overview of some of the types of challenges that we see, as well as the very high potential in this. I am going to turn it over now to Michael, who is going to pick up the mantle from me. Thanks, David. Hi, everyone, and I appreciate everyone tuning in today. I'm going to begin with a very quick discussion of what we're about to see. I don't know if anybody noticed, but all of the images in the slides that David and I put together were actually produced by generative AI, and if you look through them, I think they tell a very interesting story about some of the limitations
that David was referring to. If you look, for example, just very quickly at this image from generative AI (this is why we jokingly refer to it as "Intern GPT"), it knows that "warning" kind of makes sense in this context, but if we look over here, there's a variation on what "warning" looks like, and it's sort of misspelled. Or look, for example, very quickly at this image: all we did when we were making the presentation was take the content for the slide and put it into ChatGPT, or excuse me, into DALL-E, and DALL-E picked two white guys wearing suits to represent B2B potential. So these sorts of subtleties make a lot more sense, I think, with the kind of context that David provided, which is to say that your model is only ever as good as the data it's trained on. So today we're going to look at a use case that we hear a lot about at Mycelial, which is that oftentimes very complex mechanical things live in remote places, and it's very challenging to bring all of the maintenance information along: you can imagine the tens of thousands of pages of specs and 3D models and everything for complex units, in this case jet engines. The opportunity we're looking at here is: how can we create a large language model that could help airplane maintenance technicians who are deployed at remote airports or airfields or air bases? And those air bases and airfields may not always have a very consistent connection, certainly not enough connection to be able to exfiltrate and then query tens of gigabytes or even tens of terabytes of flight data and performance data.
Now, layer onto that challenge of actually accessing the data, like David was talking about, the ability to interact with it. Generally speaking, folks in the jet engine maintenance profession are not also co-employed as data analysts. So wouldn't it be nice if there were a natural language mechanism by which those individuals could get access to the full, robust data set of performance metrics, and interact with it in a natural language capacity? Oops, let me fix my zoom. Fit to screen, there we go, and I'll hit the right key. All right, so we talked a little bit about the objective: the goal is to help technicians understand engine performance using natural language. And we talked a little bit about some of the constraints and complications that we want to account for in this context. One is that if there's no internet or very limited bandwidth, it needs to remain operational. So this is something that we want to be running, quote unquote, on the edge. It is also a major part of our planned training, sustainment, and maintenance model as an organization that makes airplane engines. As our model improves, we want to be able to ship those updates out to the field, so the model has to be updatable. And then we also want to be able to update our models using performance data: not just the raw underlying data associated with engine performance, but also data on how technicians are interacting with the tool itself. So we want to be able to see what kind of queries our technicians are writing to our chatbot and what the nature of the response that comes out of it is. And there are lots of very interesting and sophisticated techniques for taking these kinds of outputs back at headquarters, comparing them against more powerful but more expensive-to-run versions of our local model, and then using any discrepancies that we find to further enhance the quality of the model.
So, for example, this would be sort of akin to having a supervisor at a call center listen in on some calls so the supervisor can coach the call center rep on how to better handle customer inquiries. This is the same idea, but layering that data into a training session instead. So, the architecture we're going to be looking at to facilitate this exchange: we want a technician to be able to write a query in English that says something like, "Which engines went above 200 degrees in the last day?", and we want them to get a natural language output as though they were speaking to somebody, something like, "Engines two, three, and four." Okay, that's helpful. So what's the architecture of the solution? Well, the technician is going to write something, and, again, we want to have that available for our training processes later, so we're going to want to cache it. We have to take that plain text query and put it into some model framework that will convert English into code that the database, the SQLite database, can understand. Then we have to pull those results out of the engine performance database. We then have to format those results into something consistent and write that out. Again, we want to be able to compare inputs and outputs for later training, so we want to cache those results and then forward them to the chatbot client for the user. So let's take a look at the actual technologies that we used for this solution, to be able to demonstrate this today. We're using two pieces of technology, and I'll describe them both very briefly before moving into a wider discussion of the hardware and some of the other underlying technologies here. First, we used a tool called Bacalhau, which is produced by a company called Expanso, headed up by David Aronchick, who is one of the great thinkers in the machine learning space. Any talk by him is worth checking out on YouTube, for sure.
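The query flow just described (cache the English query, translate it to SQL, run it against the engine database, then cache and forward the results) can be sketched roughly as follows. This is a minimal sketch, not the actual demo code: the table and column names are hypothetical, and the translation step is stubbed out where the real system uses an ML model.

```python
import sqlite3

# In-memory stand-ins for the three SQLite stores on the device:
# the inputs cache, the results cache, and the engine performance database.
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE inputs (id INTEGER PRIMARY KEY, query TEXT)")
db.execute("CREATE TABLE results (id INTEGER PRIMARY KEY, answer TEXT)")
db.execute("CREATE TABLE engine_data (engine_id INTEGER, temp_f REAL)")
db.executemany("INSERT INTO engine_data VALUES (?, ?)",
               [(1, 180.0), (2, 210.5), (3, 205.2), (4, 199.9)])

def translate_to_sql(question: str) -> str:
    """Stub for the English-to-SQL step; the real system uses an ML model here."""
    if "above 200 degrees" in question:
        return "SELECT DISTINCT engine_id FROM engine_data WHERE temp_f > 200"
    raise ValueError("query not understood")

def handle(question: str) -> str:
    # 1. Cache the raw query for later exfiltration and training.
    db.execute("INSERT INTO inputs (query) VALUES (?)", (question,))
    # 2. Translate English to SQL and run it against the engine database.
    rows = db.execute(translate_to_sql(question)).fetchall()
    # 3. Format the answer consistently, cache it, and return it.
    answer = "engines " + ", ".join(str(r[0]) for r in rows)
    db.execute("INSERT INTO results (answer) VALUES (?)", (answer,))
    return answer

print(handle("which engines went above 200 degrees in the last day?"))
```

Because both the inputs and results tables are ordinary SQLite tables, the same rows that drive the chat session are sitting there ready to be exfiltrated for retraining later.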
And the Bacalhau platform that they've developed addresses some of the use cases David was mentioning: how do I aggregate all my data when it lives in a bunch of different places? Or maybe I don't want to aggregate that data at all; I just want to run some query or some aggregation on it and then centralize just the results. Bacalhau is, as they call themselves, and they think it's an appropriate term, the compute-over-data framework. Bacalhau gives you a mechanism to do compute jobs, long-running jobs, over very disparate nodes where your data actually lives. So all of the headache of centralizing it, adjusting for different regulatory regimes, getting permissions and access to that data, sort of goes away. In this case, we have an edge node, an NVIDIA Jetson Orin Nano, and we're using Bacalhau to run the ML model. I'll encourage you to go check out bacalhau.org, because it's not just the LLM model that you'll see here today; there's a whole host of other use cases, including generative AI use cases, that you can get up and running right away. The other core technology we've used for this demonstration is the product of the organization that I work for, Mycelial. Mycelial is orchestrating the databases, or the data stores, that exist in this implementation. When the technician submits something to the chat client, it gets written to a SQLite database. When the model finishes its query results, those get written to a SQLite database called results. And we also have an engine database, which could actually be taking in live data; there's no reason it needs to be turned off, and we can even execute this as more data is coming into the system. This is also a SQLite database. What Mycelial is doing is defining how data should flow in this context, and also providing a mechanism for exfiltration of that data so that we can do improvement training on it.
So when the data gets written to the database, Mycelial recognizes that there's a new row and passes it to the Bacalhau part of the pipeline. And then when Bacalhau writes the results to the results database, that is picked up by the chat client as well. All of the data, as it moves through these areas you see here, is in a consistent messaging format. What that allows us to do is define exfiltration pipelines using Mycelial that say: any time this changes, when there's connectivity, please move it to my Redshift instance in the cloud, or my Snowflake instance, or something like that. So through the use of these two tools in conjunction, we get both the ability to serve the model on the edge, that is, to have the actual model itself be robust, upgradable, monitorable, auto-restarting, self-healing, all of that, so there is something we can put data into and get results from; and we have Mycelial, which is orchestrating the data flow through the system on the device, as well as providing a means of getting that data off the device. I'll just take one moment to speak a little bit about the hardware here. We are using, in this case, an NVIDIA Jetson Orin Nano, which is an extremely capable device for most edge AI use cases. They're only about 500 bucks. And this is just a regular-sized Ethernet port, a regular-sized USB port; you can see the relative scale here, they're quite small. These are indeed used in industry for the onboard computing needs of, for example, drones that need to do computer vision, et cetera. So it's more than capable for a use case like this, and at half the price of an iPad, very cost-effective in that regard. I'll just put up the two tools that we saw here again. Definitely check out bacalhau.org for all of the very interesting use cases associated with not just LLMs, but a lot of different generative AI.
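The "recognize there's a new row and pass it downstream" behavior can be approximated with a simple rowid watermark poll over SQLite. To be clear, this is a hedged sketch of the general pattern, not Mycelial's actual change-detection mechanism; the table name is hypothetical.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE results (id INTEGER PRIMARY KEY, answer TEXT)")

last_seen = 0  # watermark: the highest id already handed downstream

def poll_new_rows(conn):
    """Return rows added since the last poll, then advance the watermark."""
    global last_seen
    rows = conn.execute(
        "SELECT id, answer FROM results WHERE id > ? ORDER BY id",
        (last_seen,)).fetchall()
    if rows:
        last_seen = rows[-1][0]
    return rows

# A new result lands in the database...
db.execute("INSERT INTO results (answer) VALUES ('engines 2, 3, 4')")
print(poll_new_rows(db))  # the new row is picked up exactly once
print(poll_new_rows(db))  # nothing new, so this returns an empty list
```

The same watermark idea extends to exfiltration: when connectivity returns, everything above the last shipped id gets forwarded to the cloud warehouse.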
We should also note that the model we're using here is paraphrase-MiniLM-L6-v2. If you are interested in getting ahold of this reference architecture, you can just shoot me an email at michael@mycelial.com or check out our Discord. So, to review what we're about to see: I'm going to do the demo, and I'll be providing plain text queries that get written into an inputs database, which get translated into a SQL query via the ML model we discussed on the last slide, and then that data gets written to our results cache. All right, so I will stop sharing and then pop over to share my terminal. And David did mention, of course, that we have to keep our fingers crossed for any kind of live demo. All right. So everyone should be able to see my screen. And what you can see here is that I'm logged in to the Jetson, the little device that's plugged in in my office behind me. We can begin by asking some very basic questions. Hopefully this would be visually confirmable, but we could ask something like "How many engines are there on this aircraft?", which might be a good question. And the LLM will take a second to think about it and produce an answer for... haha. "Demo screen is blank, unable to see the screen." "Same." Okay. All right, I can see my terminal in the window. Okay, now there it is. All right, got it. Thanks, folks. So you saw the first query: how many engines are there on this aircraft? Boom, we got a solid answer: there are four. All right, now maybe as a technician we think the pilots were goofing around, so let's see if they did anything unexpected, like: which engines were off today? We can run that query, and we can see that engine two was off, and then it was off again at a different point. So again, the quality of the query is, of course, dependent on training data.
So this might be an opportunity where someone back at headquarters, seeing this result, might say: ah, I want to clean this up, I want to tidy this type of query up, I'm going to do a little tuning of the model to make that happen. But we can go even further and ask which engine attained the highest thrust. The model will take a second to think about it. And we can see here, similar to the images and that "Intern GPT" concept: we wanted to know which engine attained the highest thrust, and it returned a ranked order of thrust for each of the engines. So it did answer our question, but perhaps not in exactly the way that a human might have. We can go a step further and make it even easier for someone to introspect and understand change over time. We might say something like: show me the average thrust for all engines in the last hour. We hit enter and we see, okay, this was the average thrust. But wait a second, maybe I actually want to go back two hours. So let's do that: show me average thrust for the last two hours. Now it's going to go do the calculation, figure out what I mean by that, and we can see that the results are a little bit different; it took an average over two hours instead. And we can expand that query again and see that the numbers change for the last 12 hours. So the process you see here is converting this into a SQL query; it's querying over a million rows in our local database, and the result is, as you see, that we're able to interact with it via a textual interface. Again, huge thanks to the team at Bacalhau, particularly Ross, for helping us put this demo together. We're really excited to start operationalizing more and more of these use cases, as David was describing earlier. And with that, I will switch back over to the closing screen.
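To give a feel for what the English-to-SQL step produces, a query like "show me average thrust for the last two hours" would translate into SQL roughly like the following. This is a hedged illustration: the table and column names are hypothetical, not the demo's actual schema, and the synthetic data is only there to make the query runnable.

```python
import sqlite3
from datetime import datetime, timedelta

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE engine_data (engine_id INTEGER, ts TEXT, thrust REAL)")

# Synthetic readings: for each engine, samples 30, 90, and 200 minutes ago,
# so the 200-minute-old rows fall outside a two-hour window.
now = datetime(2023, 11, 1, 12, 0, 0)
rows = [(e, (now - timedelta(minutes=m)).isoformat(" "), 1000.0 + e * 10 + m)
        for e in (1, 2, 3, 4) for m in (30, 90, 200)]
db.executemany("INSERT INTO engine_data VALUES (?, ?, ?)", rows)

# The kind of SQL the translation step might emit for
# "show me average thrust for all engines in the last two hours":
sql = """
SELECT engine_id, AVG(thrust) AS avg_thrust
FROM engine_data
WHERE ts >= ?
GROUP BY engine_id
ORDER BY engine_id
"""
cutoff = (now - timedelta(hours=2)).isoformat(" ")
for engine_id, avg_thrust in db.execute(sql, (cutoff,)):
    print(engine_id, avg_thrust)
```

Widening the window to 12 hours is just a different cutoff parameter, which is why the demo can rerun the same question over different time ranges so cheaply.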
If I can find my slides... here we are, okay. And thank you very much. So, happy to answer any questions in the remaining few minutes that we have. Again, we really, really appreciate everyone attending. I see a question: why SQLite, rather than a NoSQL vector DB like in many other solutions? Yeah, so that's a really interesting point. Because of the popularity of SQLite, which has something like 99% of the market, oftentimes people will use SQLite in a way that is not relational at all. In particular, it's very popular for things like monotonic sensor data, which is immutable and ever-growing, so it works almost like an append-only log. And SQLite is the only database that is, for example, cleared to go on aircraft by the FAA here in the States, et cetera, et cetera. That being said, the movement of features that would normally be stored in something like a vector DB from the cloud out to an edge device is a really, really interesting use case that we would actually be happy to talk to you more about, because that's a connector we're looking to incorporate. There are a few folks out there; Nuclia, I think, has a vector DB now that's quite snazzy. So, Srinivas, if you're interested, please send me an email or jump in our Discord. Any other questions? Hey, David, thank you. All right, any other questions very quickly? If not, I will pass it back over to... ah, one more: was the model pre-trained with the data, or did it fetch the answer based on data available in SQL? Yes, so it is based on the live data that is coming into the SQLite database, Sunil. Great question. What we're doing here, to David's point about making models more context-specific, is using a model that understands the schema of the database and understands natural language, for the purposes of turning natural language into SQL queries. One of the demonstrations I did not do, but was thinking of, is that if I asked the demo you saw just now to please explain Tibetan Buddhism, it would not do a very good job.
It would probably just ask us to send a new query. Can you...? Yes, absolutely. And in particular, Bacalhau has great support for IPFS, by the way. So this is a very limited model; all it knows how to do is take language and turn it into queries. It's one of those use-case-specific models, but you can see this is something that anyone, with no training, can use to interact with their data, and that's really exciting for me. All right, back to Libby. Anyone else have any questions? Going once, going twice? All right, I think we'll wrap it up here. Thank you, Michael and David, so much for your talk today. Thank you, everyone, for joining us. We had a great turnout. Like I said, this recording will be online later today, so if you missed it, check it out there. And thanks again for joining us. We will see you next time for the next CNCF Live webinar, and we hope to see everyone at KubeCon next week in Chicago. Thank you both.