All right, thank you Candice for the introduction and for having me on the webinar. Good morning, good afternoon and good evening to you all, wherever you are joining from. I'm Dunith Dhanushka, I'm your host today, and I welcome you all to the webinar. Today we are going to discuss the role of streaming data in generative AI. We'll be looking at how streaming data and generative AI can play nicely together, how they complement each other, and most importantly, as developers and architects, how we can extract the best out of both streaming data and generative AI to build better real-time data applications. A little bit about myself, Candice has already introduced me. I'm Dunith Dhanushka, Senior Developer Advocate at Redpanda. For those who are new to Redpanda, it's a streaming data platform that is API compatible with Apache Kafka. That means if you already have a producer or a consumer that works with Kafka, it can seamlessly work with Redpanda. We'll talk about Redpanda in a later section. I'm a solutions architect and developer advocate with a background in data, stream processing and large-scale event-driven architectures. My day job as a developer advocate at Redpanda means helping developers like you and me to learn and adopt Redpanda to build scalable and faster real-time data applications. If you are interested, you can follow my work on Twitter, Medium and LinkedIn, and if you have any questions after this webinar, you can reach out and say hi on any of those channels. You can check the links posted in the chat for that. All right, let's get started. What are we going to discuss today? You know, generative AI is actually a decades-old technology, and it has been evolving over the years. But in the last year in particular, it went viral and exploded, with interesting projects like OpenAI's ChatGPT, DALL-E, Midjourney, Hugging Face, and you name it.
All these projects have taken the concept of generative AI to the masses, to people like you and me. And then everyone, including developers, decision makers and business leaders, started going after this concept of generative AI, hoping it would help their businesses perform better. I can recall a very similar moment a decade ago, when the big data hype came into the picture. Everyone believed that big data was going to solve all of our problems. Let's see whether that holds true for generative AI or not. Having said that, the goal of today's webinar is to understand generative AI to some extent, and then see what opportunities we have, as enterprises, to use our business data with generative AI, extract meaningful insights from it, and build better applications, leveraging streaming data in the process. That's the plan for today. To structure our session, I came up with this agenda. First, I'll give you a brief introduction to generative AI. If you are a complete beginner to generative AI, don't worry, I'll cover the basic foundations. Then we'll talk about the million dollar question of the day: how can you use your business data with generative AI? There are several challenges, and we'll go through each one and measure its impact. Then we'll gradually bring in streaming data, and how we can use streaming data in combination with generative AI. We'll be talking about a couple of prompt engineering techniques, fine-tuning techniques, and so on. Finally, we'll talk about some real-time use cases that are already being powered by generative AI, as well as some use cases that have the potential to benefit from streaming data. Then we'll take a couple of questions from the audience and wrap up. So that's the plan for today. Let's continue.
First, let's discuss: what is generative AI? What does it mean? Generative AI is a subfield, or a subdomain, of artificial intelligence that has the ability to generate new content based on user input. The easiest way to understand this is with a visual representation, so let's focus on the box in the middle of this diagram. Here we have a black box labeled "generative model". At the heart of generative AI we have machine learning models, and that is what this generative model represents. The model is connected to this brain, and the brain represents the dataset that the model has been trained on. There are many ways you can build and train machine learning models; I'm not going to take a deep dive into the specifics here. Then we, as humans and users, provide input data to this generative AI model. We interact with these models through a natural language interface, and most of the time this interface is called a prompt. We instruct the model to perform something, or generate something new, by giving it a prompt. The prompt is written in natural language, in plain English, or it could be Italian or Spanish, it doesn't matter. Once the model receives this input, it can synthesize an output based on the patterns and examples it has learned from its training dataset, from its "wisdom". Then it generates an output containing some sort of textual answer, an image, a piece of music, a video, and so on. Now, when it comes to commercialization and marketing, mainstream media outlets, the internet and social media platforms have made us believe that generative AI is all about ChatGPT and a few other tools, but that is wrong. Generative AI is more than these tools, and it goes beyond them.
It's a bit like electric cars in general: Tesla does not represent all electric cars. There are many other electric variants coming from vehicle manufacturers like BMW, Mercedes and Ford. It's very similar in this domain. Like I said, if we take a broader look at the composition of generative AI, at the foundational layers we can identify a couple of purposefully designed families of machine learning models. ChatGPT, which I mentioned on the earlier slide, belongs to the transformer family, so it can transform the user's input into something new. This kind of model is called an LLM, a large language model. I'll be sprinkling this term throughout the presentation, so just keep it in mind: large language model, or LLM for short. Right now we have models like Davinci, GPT-3 and GPT-4, and apart from that, we have a few other model types like RNNs, GANs and so on. I don't want to scare you by sprinkling jargon into the presentation, but just keep in mind that when you're dealing with generative AI, you'll be working with these kinds of models in particular. Then, if we go to the business side of it, we can see a vibrant, diverse and fast-growing ecosystem of generative AI companies. When you're reading the news or browsing the internet, you might have seen a couple of funding announcements for generative AI startups, "AI-based" something. There are so many ways AI can help this ecosystem. For example, we have AI-based tools that help you write better, especially for marketing people. They can summarize text, generate copy for websites, or summarize an email. They can add a tone or color to a plain paragraph so that it sounds more interesting or appealing. That's one example. And then for images, we have DALL-E, which can generate different images based on the user's specification.
You can write a natural language expression to generate an image. Then we have coding assistants like GitHub Copilot that generate actual, executable code based on the user's prompt. And there are so many other things: text summarizers, tools that can extract the transcript of a YouTube video and put it to good use. These have been around for quite some time, and the space is expected to keep growing over the next couple of years as well. Now that we understand the basics of generative AI, we come to the million dollar question, which could well be a billion dollar question: how can I use my business data with a generative AI model, a large language model? How can I get the help of a generative model to analyze my business data and come up with something new? Before that, let me talk about some limitations of the generic nature of these large language models. I'm speaking particularly about large language models, or LLMs. When you browse the internet, you can see OpenAI's ChatGPT, based on GPT-3 or GPT-4, then Google came up with Google Bard, Microsoft came up with Bing Chat, and we have Hugging Face and all these public models available out there. The common thing about all of these models is that they've been trained using data available on the public internet. That means they are pre-trained: the "PT" in GPT stands for "pre-trained transformer". You can take such a model off the shelf from the company and use it from day one. There's nothing to do on the engineering side; you can just use it as-is. That's the beauty of it. But that beauty also comes with some limitations. Because the model has been trained on the public internet, there can be some bias towards certain answers.
For example, if you ask a question, the answer could be biased towards something. And also, the most important thing: ChatGPT doesn't know about your organization's data. For example, if you are an architect or a decision maker in your organization, you cannot ask ChatGPT, "Tell me whether John Smith is going to buy this item or not, and why," or "Can you generate an email to my customer?", anything specific to your private business data. That is because ChatGPT, or any such LLM, has no context on your business data. That's the limitation, the concern, that we are trying to find an answer to today. There are several ways we can approach this problem, so let me highlight the main options. The first is building your own generative model. It's a boil-the-ocean approach, and probably not the best idea; let me tell you why. Building your own LLM, a large language model or generative model, takes a lot of effort and time, because in order to train it you need a large dataset, usually called a corpus. Building it also requires particular skill sets, especially data scientists and statisticians, which are rare-to-find skills. And even if you assemble the data and build the model, the training is a comparatively expensive operation. You might need large clusters, the training will run for days or even weeks depending on the configuration, and it might give you a staggering cloud bill. Those are the reasons why you might not want to build your own large language model. I mean, if you can do it, good, but for the majority of people it's going to be a problem, or at least challenging. That brings us to the more practical approach. So what would be the best approach if you are not building a large language model for your organization? The next best alternative is to take one of the base models, the foundational models, from the internet.
You can take ChatGPT, Google Bard, whatever, and then fine-tune it to inject your enterprise data. That's another approach, and it is specifically called fine-tuning. It's the process of taking a pre-existing language model and training it on your specific data. Again, this is a complicated process, and it requires broad skills in data science, machine learning and data engineering as well, because you're going to dissect the model, freeze several layers, and then do the training. I'm not that familiar with this domain myself; it seems like an operation that really needs good machine learning expertise. And the biggest drawback is that, as you know, business data is generated continuously. Even if you fine-tune the model today, the training dataset can become outdated as soon as new data comes in. That's the biggest disadvantage of fine-tuning. And that leads us to find a less complicated and more cost-effective approach to using LLMs. That is where we come to prompt engineering. So what is prompt engineering? I keep dropping jargon, so let me bust this one too. Let me get back to the initial definition. As you remember from the diagram I showed you, we as humans use a prompt to communicate with the large language model. This prompt could be in the form of a question, conversational style, or it could be a command, where I'm instructing the model to generate something, or it could be a question asked just for learning purposes. Then, based on its training dataset, the LLM can synthesize the answer and respond. Now we are going to look at a less complicated prompt engineering technique. Can we engineer the prompt to contain my private data, given that we all know the LLM has no access to, or awareness of, that data?
This is where we use the concept of prompt engineering. In prompt engineering, the description of the task is embedded in the user input rather than being given explicitly. It's like asking the LLM about a specific user, a specific customer in your business. Before sending the prompt to the model, you build the user's context by reading your customer data and stating the facts, and we call that the context. Basically, we have our data here, and what we do is build a prompt by prepending your business context to it. Let me give you an example. Let's say we need to notify customers about a potential flight delay. This particular flight has been delayed, and we need to generate a good, personalized notification for each customer that would be affected by this delay. To do that, we can build the context beforehand like this. In the context, as you can see, I'm stating the facts: the flight number is this, the departure time is this, and so on. These facts are specific to your business; just keep that in mind. Then I give the instruction to the model. You can think of this as a two-tier approach: first you build the context, and then you instruct. The instruction is like processing logic, like writing code: I'm instructing the model, saying my customer's flight has been delayed, digest this information and generate a message to the customer. That's the approach we are going to look at today. Once you pass the whole thing in, the prompt, the model sees everything in one place, and based on this information it can synthesize its output.
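To make the two-tier idea concrete, here is a minimal Python sketch of building such a prompt. The field names, example values, and function shape are all hypothetical illustrations, not from any particular SDK.

```python
# Sketch of "context injection": state business facts first (tier one),
# then append the instruction (tier two). All fields here are made up.

def build_prompt(flight: dict, customer: dict, instruction: str) -> str:
    # Tier one: the context, stating facts specific to your business.
    context = (
        f"Flight number: {flight['number']}\n"
        f"Scheduled departure: {flight['scheduled_departure']}\n"
        f"New departure: {flight['new_departure']}\n"
        f"Customer name: {customer['name']}\n"
        f"Loyalty tier: {customer['tier']}\n"
    )
    # Tier two: the instruction, telling the model what to do with the facts.
    return f"{context}\n{instruction}"

prompt = build_prompt(
    {"number": "UA123", "scheduled_departure": "09:30", "new_departure": "12:45"},
    {"name": "John Smith", "tier": "Gold"},
    "The customer's flight has been delayed. Using the facts above, "
    "write a short, personalized notification message to the customer.",
)
```

The resulting string is what gets sent to the model as a single prompt.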
And it will be a lot faster, because the amount of information the model has to process is low and everything is contained in a single prompt. So how do we provide this context? What are the mechanics here? There are several ways. The easiest way is to do it on demand, when a user types a question. Let's say you are a customer of an airline, and the airline provides you with a chatbot or chat agent. You can log in and type a question: "Can I get an extra bag of luggage on this particular flight?" What happens next is that, on your behalf, the chat application queries a database, let's say a generic database for example, and fetches your personal details, like your age, your tier, your loyalty points and so on. Then it builds the context for you, and on your behalf, that context gets prepended to your prompt. And like I said on the previous slide, there will be two tiers, the context and your prompt, and the whole thing gets passed to the OpenAI model. There are some things that need tweaking, though. For example, one thing large language models do not have is long-term memory. They have a limit on the number of tokens you can pass; tokens are roughly characters or word pieces. For example, GPT-4, as I remember, has a limit of 8,000 tokens per session; someone can correct me if I'm wrong. For long-term memory, we can use a vector database. But you don't need a vector database all the time, even though vector databases have become a trend again as well.
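A rough sketch of guarding that token limit is below. The four-characters-per-token ratio is only a heuristic I'm assuming for English text; a real application would count tokens with the model's actual tokenizer (for example, OpenAI's tiktoken library) rather than estimate.

```python
# Crude context-window guard: the prompt and the model's reply share the
# same token budget, so leave headroom for the answer.

def estimate_tokens(text: str) -> int:
    # Heuristic only: roughly 4 characters per token for English text.
    return max(1, len(text) // 4)

def fits_in_window(context: str, question: str, limit: int = 8000,
                   reserved_for_answer: int = 1000) -> bool:
    used = estimate_tokens(context) + estimate_tokens(question)
    return used + reserved_for_answer <= limit
```

If the assembled context doesn't fit, that's the signal to trim it, or to fetch only the most relevant pieces (for instance, from a vector database) instead of prepending everything.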
Another distinction: if you know everything about your customer exactly, I mean, if you have keyed in every transaction, for example loyalty points and past transactions for a customer, all in an easily searchable manner, and you know exactly what you are looking for, then go for it, you can use a regular, fast database. But if there is something we don't know exactly how to look up, for example a policy, like a baggage policy document, that could go in a vector database, because the policy document will be very large and it is impractical to prepend the whole thing into a single prompt; it would eat up your token budget as well. The point here is, you don't have to use both. Depending on the circumstances, you can choose either a database or a vector database. This is still a debated topic; we'll see. Now, what are the challenges of this approach, which we call context injection? The most challenging thing is to keep the data fresh and relevant. The context we are injecting into a prompt should be fresh and still relevant; we should not use data that is stale relative to this moment. A batch-driven approach doesn't work here. And the data retrieval must be fast: when the user types in the prompt, the mechanism should respond very quickly with the customer data. We'll look at the whole big picture in a later slide. Those are the two requirements: the freshness of the data, and the retrieval or read performance. With that in mind, how are we going to find a solution? Those requirements prompt us to have data in motion, meaning data must be dynamic and must be fast. And that leads us to use streaming data in our solution. This is the critical part of our presentation.
Before we proceed further, we have a poll question coming in. It might take a couple of seconds to answer; it's a no-brainer. Let's just give it some time and see some answers coming in. All right, let's close the poll and move on. So let me introduce what streaming data is. I'm pretty sure many of you are already familiar with it, but for new users in the audience, let me define it. Streaming data is composed of streams of data. A stream is a continuous, never-ending flow of data with no beginning or end. It is unbounded, the data is incrementally made available over time, and it enables you to act upon that data without downloading it first. The best way to understand streaming data is to compare it with a video streaming service like Netflix. When you are watching a video on Netflix, you just press the play button and watch, right? You don't have to download it; the video is rendered as new data comes in. The same applies to streaming data; the only difference is the transport protocol. In video streaming we have RTSP, but in streaming data we have different protocols for this. And if we zoom into a stream of data, we can identify events. An event is the foundational building block of a stream. A data stream consists of a series of data points ordered in time, and each data point represents a state change that occurred in the source system. The source system is the place where the data is coming from, the origin of the data. As you see in the diagram, the event source produces events, and these events arrive as a stream. The first event has the oldest timestamp, the time when the event happened, and then the stream flows through different systems.
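A minimal, broker-free sketch of those definitions: a Python generator stands in for the event source, and the consumer acts on each event as it arrives rather than downloading the whole stream first. The event fields are made up for illustration.

```python
# A stream as an unbounded, time-ordered series of events, processed
# incrementally. No real broker here; a generator plays the event source.

import time
from typing import Iterator

def event_source() -> Iterator[dict]:
    # Each event records a state change plus the time it happened.
    for offset, change in enumerate(["created", "updated", "shipped"]):
        yield {"offset": offset, "state": change, "ts": time.time()}

def consume(stream: Iterator[dict]) -> list:
    seen = []
    for event in stream:             # act on each event as it arrives,
        seen.append(event["state"])  # never materializing the whole stream
    return seen

print(consume(event_source()))  # -> ['created', 'updated', 'shipped']
```

In a real system the consumer would subscribe to a topic on a streaming platform, but the processing pattern, one event at a time in timestamp order, is the same.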
This is connected to the whole event-driven architecture and the publish-subscribe pattern, where publishers produce events, and subscribers who are interested in getting to know about these state changes subscribe to the event streams. In the middle, we have an event bus, in practice a streaming data platform, acting as a conduit between event producers and consumers. Now, let's see how we can leverage streaming data to inject the customer context into a large language model. That's the million dollar question. We come to this high-level architecture. Everything starts on the left side, as you can see. This is a continuous operation, not a one-off batch job; I'll explain the difference. We have your corporate data sources: databases, file systems, legacy applications, mobile devices and so on. They keep pumping out events; they keep producing events. In the middle, we have a streaming data platform like Redpanda holding this customer data. Then we have a third layer, a stream processor or a streaming database, something like Materialize, RisingWave or Flink; there are many choices you can pick from. What they do is ingest the customer data coming from Redpanda in real time and build materialized views. We can call this a customer 360 view; that's the standard term when we deal with data warehouses and other analytical applications. Basically, you can define a view with a primary key, the customer ID, along with the past transactions the customer has performed, their loyalty points, and so on. The important thing here is that this materialized view keeps getting updated in real time. It doesn't stay still; it keeps updating. How?
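Here is a toy sketch of a customer 360 materialized view kept fresh by upserting incoming change events, mimicking in a few lines what a streaming database maintains for you. All field names and IDs are hypothetical.

```python
# Toy "customer 360" materialized view: a view keyed by customer ID,
# updated incrementally as change events arrive.

customer_view: dict[str, dict] = {}

def apply_change(event: dict) -> None:
    # Upsert by primary key: merge the changed fields into the current row.
    row = customer_view.setdefault(event["customer_id"], {})
    row.update(event["fields"])

apply_change({"customer_id": "C42",
              "fields": {"name": "John Smith", "loyalty_points": 1200}})
apply_change({"customer_id": "C42",
              "fields": {"address": "22 New Street"}})  # e.g. a CDC update

print(customer_view["C42"])
```

A real streaming database would express the view as SQL over an ingested topic and handle ordering, retention, and fault tolerance; the upsert-per-event idea is what keeps the view current.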
For example, if the customer changes their address, we can use a change data capture mechanism like Debezium to stream that change into Redpanda, and within a few seconds that change will be reflected in the underlying materialized view. That's the true beauty here. This makes sure that the customer data we maintain is always up to date and easy to access. Then we have the workflow on the right-hand side. In the middle we have an application that you build. It could be anything: a support agent exposed to your customers, or an internal application, something like a help desk, or a console exposed only to your employees so they can ask questions about your business data. Then we have the user asking the original question. For example, let's say this is a support bot and the customer asks, "Am I eligible for an extra piece of luggage, or are you going to charge me for it?" Second, the application receives the prompt and does the context injection: it queries the customer data. At the moment this user logs into the application, we can capture their customer ID, run it against the materialized view, and get the complete picture of that customer as the context. That context is then prepended to the original prompt, we can call that enrichment, and the whole thing is passed to the large language model out on the internet. This is the pre-trained model; it could be one of OpenAI's GPT-3 or GPT-4 varieties, or something from Hugging Face, and the major cloud vendors have their own implementations as well.
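The retrieve-enrich-call flow might be sketched like this. The chat-completions endpoint and payload shape follow OpenAI's public HTTP API, but the view contents, field names, and customer ID are invented for illustration, and the in-memory dict stands in for a real query against the streaming database.

```python
# Sketch: look up the logged-in customer in the materialized view, prepend
# the context to the user's question, then send the enriched prompt to a
# chat model. The view below is a stand-in for a streaming-database query.

import json
import urllib.request

CUSTOMER_VIEW = {
    "C42": {"name": "John Smith", "tier": "Gold", "baggage_allowance": 2},
}

def enrich(customer_id: str, question: str) -> str:
    customer = CUSTOMER_VIEW[customer_id]
    context = "\n".join(f"{k}: {v}" for k, v in customer.items())
    return f"Customer facts:\n{context}\n\nQuestion: {question}"

def ask_model(prompt: str, api_key: str) -> str:
    # OpenAI chat-completions call over plain HTTP (no SDK dependency).
    body = json.dumps({
        "model": "gpt-4",
        "messages": [{"role": "user", "content": prompt}],
    }).encode()
    req = urllib.request.Request(
        "https://api.openai.com/v1/chat/completions",
        data=body,
        headers={"Authorization": f"Bearer {api_key}",
                 "Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)["choices"][0]["message"]["content"]

enriched = enrich("C42", "Am I eligible for an extra piece of luggage?")
```

Passing `enriched` to `ask_model` with a valid API key would complete the loop; the model's answer then flows back to the user.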
Here you can use APIs to integrate with the application, but if you use a framework like LangChain or LlamaIndex, and there are many such frameworks coming up, LangChain could be the best option here, it can abstract these models for you so that you can focus on just the integration. The model then does the synthesizing and answers, and the answer flows back down to the user. That's a rough schematic of streaming data and large language models working in unison to fulfill a user request. Yes, I have a question: how do you get a customer ID from an app? Good question. Usually it's up to you. If you are designing a mobile application or a support bot, you can capture it upon login, and there are other ways too. Like I said, if it's an internal application, which makes this easier, you can capture the employee ID, or you can simply ask the employee to enter the customer ID. So there are a couple of options here. Right. Now, what is the role of Redpanda here? As you can see, Redpanda acts as the buffer, ingesting all this data and passing it on to the stream processing layer. For those who are new to Redpanda, like I mentioned, it's a streaming data platform API compatible with Apache Kafka. That means if you already have Kafka producers and consumers, you can seamlessly integrate them with Redpanda, since Redpanda speaks the Kafka language. At the same time, Redpanda has some significant architectural differences that distinguish it from Kafka in terms of scalability, operational simplicity and cost effectiveness when you operate it in production. If you're interested in finding out more, you can go to redpanda.com and check out the website, the documentation, and the associated YouTube channel as well.
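Because Redpanda speaks the Kafka protocol, a stock Kafka client can be pointed at it unchanged; only the bootstrap address differs. Below is a hedged sketch using the kafka-python client, where the broker address and topic name are assumptions for illustration.

```python
# Producing customer events to Redpanda with an ordinary Kafka client.
# Broker address and topic name are illustrative assumptions.

import json

def serialize(event: dict) -> bytes:
    # Encode each event as UTF-8 JSON for the topic.
    return json.dumps(event).encode("utf-8")

if __name__ == "__main__":
    from kafka import KafkaProducer  # pip install kafka-python

    producer = KafkaProducer(
        bootstrap_servers="localhost:9092",  # a local Redpanda broker
        value_serializer=serialize,
    )
    producer.send("customer-events",
                  {"customer_id": "C42", "state": "address_changed"})
    producer.flush()  # block until the event is acknowledged
```

Swapping a Kafka cluster for Redpanda here means changing only `bootstrap_servers`; the producer code itself is untouched.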
And make sure to check out the chat for more links. There's a question from Rudraksha: does Redpanda provide any free tier for use in a personal project as a student? Yes. We have a cloud option, Redpanda Cloud. You can register for an account, spin up a Redpanda cluster, and give it a try; that's the easiest option. But if you're interested in just trying it out locally, you can use our Docker images, and we have Kubernetes Helm charts you can try as well. All right. So what are the benefits of this approach of combining streaming data and generative AI? The biggest benefit is that the streaming engine ensures the business data is always fresh and relevant. We make sure the model is never dealing with a customer data item that is out of date. Also, the underlying mechanisms, materialized views and vector databases, provide fast access to the data. Whenever we do the context injection for a prompt, these databases are designed to provide very fast access to customer information. And this whole combination of streaming data and generative AI unlocks many use cases and business potentials. For example, it enables advanced data analysis within an organization. Say you have a pile of organizational data that has been collected over several years, and you have obsolete data warehousing tools and BI tools. If you integrate this approach, streaming data plus LLMs, you can ask the LLM to solve different problems for you. For example, a department head might ask the LLM, "Is my department going to meet these goals?" The CEO might ask, "Is my business profitable, or is it heading for a loss? What's the projection?" Those kinds of things. Even non-technical users can use natural language to ask their questions and interact with it as if they were talking to a real human. That's one use case. The second one is AI-assisted marketing.
This means the LLM can be used to generate cold emails based on past customer interactions. We can also offer hyper-personalized user experiences, especially for digital customers, based on real-time interactions. For example, if a train is getting delayed, we can ask the LLM to generate a personalized recommendation for the next best action: as a passenger, should I book a hotel, take the next available train, or cancel my plans? Things like that will ultimately drive more loyalty, engagement and sales for businesses. And if we go beyond the typical LLMs and consider the other model variants, like GANs, autoencoders and RNNs, there are a few other use cases. For example, data augmentation, or imputation, becomes possible. When you are processing data from IoT devices, or doing time series processing, there can be missing data points in many situations because of the unreliability of the network, leaving gaps. In such cases, we can use a model to generate the missing pieces and smooth out the calculations. This is a typical flow: not exactly real time, but a near-real-time use case. For example, here the MQTT data is coming in, then a stream processor like Flink does the stream processing, and the model inferencing and imputation happen within that flow. At the end, we have the complete data with full fidelity. Another potential use case would be dynamic generation of gaming content depending on the user's interactions. For example, if you love McDonald's or KFC and you play a game, mid-game you might see a McDonald's billboard in the middle of the game, because it was generated automatically based on your past interactions with that business. There are so many possibilities.
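As a simplified stand-in for the imputation step, the sketch below fills gaps in a time series by linear interpolation between the nearest known readings. A real pipeline might instead have a trained generative model produce the missing points; this just shows where imputation sits in the flow.

```python
# Fill missing readings (None) in a time series by interpolating linearly
# between the nearest known neighbors on each side.

def impute(series: list) -> list:
    filled = list(series)
    for i, value in enumerate(filled):
        if value is None:
            # Nearest known readings before and after the gap.
            lo = next(j for j in range(i - 1, -1, -1) if filled[j] is not None)
            hi = next(j for j in range(i + 1, len(filled)) if filled[j] is not None)
            frac = (i - lo) / (hi - lo)
            filled[i] = filled[lo] + frac * (filled[hi] - filled[lo])
    return filled

print(impute([10.0, None, None, 16.0]))
```

In the MQTT scenario above, a stream processor would run this kind of fill over each sensor's window before handing the completed series downstream.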
And this is already widely used in practice. If you are using Grammarly, the writing assistant, it can suggest a better way of writing as you are typing. For example, once you have written a paragraph, it can summarize it, or add a different tone or color to make what you're writing sound better. All right, I think we are at the top of the hour, so let's summarize what we have discussed so far. Generative AI has been around for some time. The biggest challenge is how you can use your business data with a generative AI model to make your business better. We discussed several approaches, each with challenges: building your own LLM is rarely an option, and fine-tuning is quite complicated. The most cost-efficient and approachable option is usually context injection with a mix of prompt engineering. There we can utilize streaming data to ensure the customer data is always in motion, dynamic, and available whenever it's needed. And then we discussed a couple of use cases where real-time generative AI benefits from streaming data. That's all I wanted to share with you today; thanks for joining and listening. I can take a couple of questions now. Manju is asking how many servers are required for a mid-sized company. I presume this is about Redpanda, so I'll answer in the context of Redpanda. If you are just starting out with Redpanda, you can start with just one node, or if you are going with a containerized deployment, just one container, just to play around with it. But when you are scaling out to production, we recommend at least three Redpanda servers, to keep the quorum and make the deployment fault tolerant. There are many deployment patterns to consider when you scale out to production; go and check out the Redpanda documentation for a more detailed answer.
But in summary, you can just start with a single one. Okay, I don't see any further questions. So thank you all, and allow me to hand it back to Candice. Thank you so much, Dunith, for your time today, and thank you everyone for joining us. As a reminder, this recording will be on the Linux Foundation's YouTube page later today. We hope you have a wonderful day and that you join us for future webinars. Thanks so much.